recursive folder compare
-
Dear folks, I have many folders to compare, but I don't know how to compare them without running out of memory. Let me explain: I have two servers with around 10 TB of data each. I use the "xcopy" command (on Windows/DOS) to copy the data incrementally. The data on the first server changes all the time; the other server is only a mirror. Sometimes I need to check that the folders on the two servers are the same (just the folders). I tried IEnumerable/List<>... to do the work, but it uses too much CPU or memory. The folder structure is the same on both servers, so all I need is to compare the two structures. If anybody has an idea (sample code, an algorithm, or pseudo-code), I would appreciate it. Many thanks to you all.
-
Hi, could you post what you have already? My first thought was to use threads, which might speed up the execution. Regards, Sebastian
It's not a bug, it's a feature! Check out my CodeProject article Permission-by-aspect. Me in Softwareland.
-
Thanks for your prompt reply, but I think I have found another idea: get the list of folders on the first server and look for each one on the second server. If the folder exists, ignore it; otherwise, log its full path. For example, suppose I have this:

S1 : c:\rootfolder_S1\folder1\folder2\folder3
S2 : c:\rootfolder_S2\folder1\folder2\folder3

Beginning from c:\rootfolder:
- get the list of folders for one level; I got: folder1
- use Path.GetFileNameWithoutExtension(dir); I got: folder1
- use Path.Combine(S2, foldername); I got: c:\rootfolder_S2\folder1

That works, and it is a nice algorithm. My problem now is how to use it when recursing into the folders of S1. Path.GetFileNameWithoutExtension(dir) returns only the last name of the folder, so combining it with S2 gives a wrong path as soon as I am more than one level deep. What I'm going to try is to take the length of the S1 root and use Substring(index) to cut the root off each path before combining with S2, but how can I get the length of S1? S1 : c:\rootfolder -> should have 13 characters. This is my sample code:
using System;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        CombinePaths(args[0], args[1]);
    }

    private static void CombinePaths(string p1, string p2)
    {
        string[] dirs = Directory.GetDirectories(p1);
        foreach (string dir in dirs)
        {
            try
            {
                // Keep only the last path segment (works for one level only).
                int index = dir.LastIndexOf("\\") + 1;
                string foldername = dir.Substring(index);
                string combination = Path.Combine(p2, foldername);
                if (!Directory.Exists(combination))
                {
                    Console.WriteLine(dir);
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
            // GetFileName, not GetFileNameWithoutExtension: the latter cuts
            // a folder name such as "v1.2" down to "v1".
            Console.WriteLine(Path.GetFileName(dir));
            // RecurseFolder(dir,dirs);
        }
    }
}

This is what I have for now. I'll reply back once I have found how to get the number of characters in the source string. My code may not be very clear yet, but keep in mind that it is still a draft. See you later.
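On the open question of the root length: string.Length gives the number of characters, so cutting the root off a path is a Substring call. Below is a minimal sketch of that idea, not a definitive implementation. PathMapper and MapToDestination are hypothetical names introduced for illustration, and the sketch assumes both roots are ordinary folders rather than bare drive roots such as c:\.

```csharp
using System;
using System.IO;

static class PathMapper
{
    // Hypothetical helper: maps a folder under sourceRoot to the same
    // relative location under destRoot.
    public static string MapToDestination(string sourceRoot, string destRoot, string sourceDir)
    {
        // Drop a trailing separator so the offset is the same whether or
        // not the root was typed with one at the end.
        string root = sourceRoot.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        // sourceDir must be the root itself or start with root + separator.
        bool underRoot = sourceDir.StartsWith(root, StringComparison.OrdinalIgnoreCase)
            && (sourceDir.Length == root.Length || IsSeparator(sourceDir[root.Length]));
        if (!underRoot)
            throw new ArgumentException("sourceDir is not under sourceRoot");

        // root.Length is the number of characters to skip; TrimStart then
        // removes the separator that follows the root.
        string relative = sourceDir.Substring(root.Length)
            .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return Path.Combine(destRoot, relative);
    }

    private static bool IsSeparator(char c)
    {
        return c == Path.DirectorySeparatorChar || c == Path.AltDirectorySeparatorChar;
    }
}
```

On Windows, MapToDestination(@"c:\rootfolder_S1", @"c:\rootfolder_S2", @"c:\rootfolder_S1\folder1\folder2") should return c:\rootfolder_S2\folder1\folder2, because the whole relative part folder1\folder2 is kept instead of only the last folder name.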
-
So here is my final code, which does what I expect (no excessive memory or CPU usage):
using System;
using System.Collections.Generic;
using System.IO;

class Program
{
    // Log was not shown in the original post; a List<string> is assumed here.
    private static readonly List<string> Log = new List<string>();

    static void Main(string[] args)
    {
        // args[0] : source server
        // args[1] : destination server
        CombinePaths(args[0], args[1]);
    }

    private static void CombinePaths(string S, string D)
    {
        // Trim a trailing "\" so the offset is right for both
        // "c:\root" and "c:\root\".
        S = S.TrimEnd('\\');
        int indexRoot = S.Length + 1; // skip the root and the "\" after it
        var stack = new Stack<string>();
        stack.Push(S);
        while (stack.Count > 0)
        {
            string dir = stack.Pop();
            try
            {
                foreach (string sd in Directory.GetDirectories(dir))
                {
                    stack.Push(sd);
                    // Path of sd relative to the source root.
                    string foldername = sd.Substring(indexRoot);
                    string combination = Path.Combine(D, foldername);
                    if (!Directory.Exists(combination))
                    {
                        Console.WriteLine(sd);
                    }
                }
            }
            catch (UnauthorizedAccessException e)
            {
                Log.Add(e.Message);
            }
        }
    }
}

The principle is this: the program iterates over all directories under the root folder, uses the length of the source root to cut each subdirectory path down to its relative part, combines that part with the destination root, and finally checks whether the folder just listed on the source server exists on the destination server. (I think it is something like "dir /s" on the DOS command line.) This is what I had been trying to get for 9 days, but I still need help optimizing my app. As I have just counted, some of my root folders contain 2,000,000 to 3,000,000 folders. So I do not want to iterate over all of them; I need to stop at level ten (10) or twenty (20), meaning I need to specify a maximum depth of iteration, but I don't know how to do that with this Stack<> technique. Any suggestions are welcome. Thanks to all!
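One possible way to add a depth limit to the Stack<> loop is to push each folder together with its depth and stop pushing once the limit is reached. The following is only a sketch under assumptions: DepthLimitedCompare and FindMissing are hypothetical names, the source root is assumed to be an ordinary folder (not a bare drive root), and unreadable folders are simply skipped rather than logged.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DepthLimitedCompare
{
    // Hypothetical helper: yields source folders that have no counterpart
    // under destRoot, looking at most maxDepth levels below sourceRoot.
    public static IEnumerable<string> FindMissing(string sourceRoot, string destRoot, int maxDepth)
    {
        if (maxDepth < 1) yield break;

        string root = sourceRoot.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        // Each entry is (folder, depth); the root itself is at depth 0.
        var stack = new Stack<KeyValuePair<string, int>>();
        stack.Push(new KeyValuePair<string, int>(root, 0));

        while (stack.Count > 0)
        {
            KeyValuePair<string, int> entry = stack.Pop();
            string dir = entry.Key;
            int depth = entry.Value;

            string[] children;
            try
            {
                // Only one level is held in memory at a time.
                children = Directory.GetDirectories(dir);
            }
            catch (UnauthorizedAccessException)
            {
                continue; // skip folders that cannot be read
            }

            foreach (string child in children)
            {
                // Path of the child relative to the source root.
                string relative = child.Substring(root.Length + 1);
                if (!Directory.Exists(Path.Combine(destRoot, relative)))
                    yield return child;

                // Children sit at depth + 1; descend into them only if
                // their own children would still be within the limit.
                if (depth + 1 < maxDepth)
                    stack.Push(new KeyValuePair<string, int>(child, depth + 1));
            }
        }
    }
}
```

With maxDepth set to 10, a call such as foreach (string d in DepthLimitedCompare.FindMissing(args[0], args[1], 10)) Console.WriteLine(d); would check folders down to the tenth level and never open anything deeper. A further saving, which does change the output, would be to skip the Push for a folder that is already missing on the destination, since everything below it is necessarily missing too.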