Code Project
recursive folder compare

Category: C#
Tags: sysadmin, algorithms, performance
    vantoora wrote (#1):

    Dear folks, I have many folders to compare, but I don't know how to compare them without running out of memory. Let me explain: I have two servers with around 10 TB of data each, and I use the "xcopy" command (from the Windows command prompt) to copy the data incrementally. The data on the first server changes constantly; the second server is only a mirror. From time to time I need to check that the folders on the two servers are the same (just the folders, not the files). I tried IEnumerable/List<>... to do the work, but that consumes too much CPU or memory. The folder structure is supposed to be identical on both servers, so all I need is to compare the two structures. If anybody has an idea (sample code, an algorithm, or pseudo-code), I would appreciate it. Many thanks to you all.
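
One possible way to keep memory low, offered as a sketch under assumptions rather than as anything posted in this thread: stream the source tree lazily and test each folder against the mirror as it is produced, so no full list of folders is ever built. It assumes .NET Core 2.1 or later (for EnumerationOptions and Path.GetRelativePath); the names FolderDiff and MissingInMirror are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FolderDiff
{
    // Lazily yields every directory under sourceRoot that has no
    // counterpart under mirrorRoot. Results are produced one at a time
    // instead of being collected into a list first.
    public static IEnumerable<string> MissingInMirror(string sourceRoot, string mirrorRoot)
    {
        var options = new EnumerationOptions
        {
            RecurseSubdirectories = true, // walk the whole tree
            IgnoreInaccessible = true,    // skip folders that cannot be read
            AttributesToSkip = 0          // include hidden/system folders (skipped by default)
        };

        foreach (string dir in Directory.EnumerateDirectories(sourceRoot, "*", options))
        {
            // Path of the folder relative to the source root, e.g. "folder1\folder2".
            string relative = Path.GetRelativePath(sourceRoot, dir);

            if (!Directory.Exists(Path.Combine(mirrorRoot, relative)))
            {
                yield return dir;
            }
        }
    }
}
```

A caller could simply loop over the result, for example `foreach (string d in FolderDiff.MissingInMirror(args[0], args[1])) Console.WriteLine(d);`, printing each missing folder as soon as it is found.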

      SeMartens wrote (#2):

      Hi, could you post what you have already? My first thought was to use threads, which might speed up the execution. Regards, Sebastian

      It's not a bug, it's a feature! Check out my CodeProject article Permission-by-aspect. Me in Softwareland.

        vantoora wrote (#3):

        Thanks for your prompt reply, but I think I have found another idea: get the list of folders from the first server and look for each one on the second server. If the folder exists, ignore it; otherwise, log its full path. For example, suppose I have this:

        S1 : c:\rootfolder_S1\folder1\folder2\folder3
        S2 : c:\rootfolder_S2\folder1\folder2\folder3

        Beginning from c:\rootfolder:
        - get the list of folders for one level, which gives: folder1
        - use Path.GetFileNameWithoutExtension(dir), which gives: folder1
        - use Path.Combine(S2, foldername), which gives: c:\rootfolder_S2\folder1

        That works, and it is a very nice algorithm. My problem now is how to use it when recursing through the folders in S1. Path.GetFileNameWithoutExtension(dir) returns only the last name of the folder, so combining it with S2 gives a wrong path for anything below the first level. What I'm going to try is to take the length of S1 (the root folder only) and use Substring(index) to cut the root off before combining with S2, but how can I get the size or length of S1? S1 : c:\rootfolder -> should have 13 characters. This is my sample code:

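        using System;
        using System.IO;

        class Program
        {
            static void Main(string[] args)
            {
                // args[0]: source root (S1), args[1]: mirror root (S2)
                CombinePaths(args[0], args[1]);
            }

            // Checks one level only: prints each direct subfolder of p1
            // that has no folder of the same name under p2.
            private static void CombinePaths(string p1, string p2)
            {
                string[] dirs = Directory.GetDirectories(p1);
                foreach (string dir in dirs)
                {
                    try
                    {
                        // Keep only the last path segment (the folder name).
                        int index = dir.LastIndexOf("\\") + 1;
                        string foldername = dir.Substring(index);
                        string combination = Path.Combine(p2, foldername);
                        if (!Directory.Exists(combination))
                        {
                            Console.WriteLine(dir);
                        }
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine(e.Message);
                    }
                    // Debug output. GetFileNameWithoutExtension drops everything
                    // after the last dot, so "folder.v2" prints as "folder".
                    Console.WriteLine(Path.GetFileNameWithoutExtension(dir));
                    // RecurseFolder(dir,dirs);
                }
            }
        }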

        This is what I have for now. I'll reply back once I've found how to get the number of characters in the source string. My code may not be very clear yet, but keep in mind it is still a draft. See you later.
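
The length question has a direct answer: a string's character count is its Length property, so S1.Length is 13 for c:\rootfolder. A minimal sketch of the re-rooting step built on that, assuming the full path really lies under the source root; the names PathMapper and MapToMirror are hypothetical and not from the post:

```csharp
using System;
using System.IO;

public static class PathMapper
{
    private static bool IsSeparator(char c) => c == '\\' || c == '/';

    // Re-roots fullPath (which must lie under sourceRoot) onto mirrorRoot.
    public static string MapToMirror(string sourceRoot, string mirrorRoot, string fullPath)
    {
        // Ignore a trailing separator so "c:\root" and "c:\root\" behave alike.
        string root = sourceRoot.TrimEnd('\\', '/');

        // The root must be followed by a separator (or nothing), otherwise
        // "c:\root" would wrongly match "c:\root10\x".
        bool underRoot = fullPath.StartsWith(root, StringComparison.OrdinalIgnoreCase)
            && (fullPath.Length == root.Length || IsSeparator(fullPath[root.Length]));
        if (!underRoot)
        {
            throw new ArgumentException("Path is not under the source root.", nameof(fullPath));
        }

        // root.Length is the character count of the source root.
        string relative = fullPath.Substring(root.Length).TrimStart('\\', '/');
        return Path.Combine(mirrorRoot, relative);
    }
}
```

On Windows, calling MapToMirror with c:\rootfolder_S1, c:\rootfolder_S2 and c:\rootfolder_S1\folder1\folder2\folder3 gives c:\rootfolder_S2\folder1\folder2\folder3, which can then be passed to Directory.Exists.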

          vantoora wrote (#4):

          So finally, here is my final code, which does what I expected (no excessive memory or CPU usage):

          using System;
          using System.Collections.Generic;
          using System.IO;

          class Program
          {
              static void Main(string[] args)
              {
                  // args[0] : source server
                  // args[1] : destination server
                  CombinePaths(args[0], args[1]);
              }

              private static void CombinePaths(string S, string D)
              {
                  // Explicit stack instead of recursion: holds folders still to visit.
                  var stack = new Stack<string>();
                  stack.Push(S);
                  while (stack.Count > 0)
                  {
                      string dir = stack.Pop();
                      try
                      {
                          foreach (string sd in Directory.GetDirectories(dir))
                          {
                              stack.Push(sd);
                              // Path relative to S. Trimming the separator afterwards
                              // also works when S itself ends with a backslash.
                              string foldername = sd.Substring(S.Length).TrimStart('\\', '/');
                              string combination = Path.Combine(D, foldername);
                              if (!Directory.Exists(combination))
                              {
                                  Console.WriteLine(sd);
                              }
                          }
                      }
                      catch (UnauthorizedAccessException e)
                      {
                          // Stand-in for the author's Log.Add helper, which is not shown in the post.
                          Console.Error.WriteLine(e.Message);
                      }
                  }
              }
          }

          The principle is this: the program iterates over all directories inside the root folder, strips the root's length from each subdirectory path, combines the remainder with the destination server's root, and finally checks whether the folder just listed on the source server exists on the destination server. (I think it is something like "dir /s" at the DOS command prompt.) This is what I had been trying to get for 9 days, but I still need help optimizing my app. As I have just counted, some of my root folders contain 2,000,000 - 3,000,000 folders. I do not want to iterate over all of them; I need to stop at level ten (10) or twenty (20). In other words, I need to specify a maximum depth for the iteration, but I don't know how to do that with this Stack<> technique. Any suggestions are welcome. Thanks for all!
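
One way to add a depth limit to the stack approach, sketched here as a suggestion rather than a tested answer from the thread: store each folder's depth on the stack next to its path, and stop expanding a folder once the limit is reached. It assumes C# 7 or later for tuple syntax; the names DepthLimitedCompare, FindMissing and maxDepth are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DepthLimitedCompare
{
    // Yields source folders, at most maxDepth levels below sourceRoot,
    // that do not exist under mirrorRoot. Depth 1 means direct children.
    public static IEnumerable<string> FindMissing(string sourceRoot, string mirrorRoot, int maxDepth)
    {
        // Each stack entry carries the folder and how deep it sits below the root.
        var stack = new Stack<(string Dir, int Depth)>();
        stack.Push((sourceRoot, 0));

        while (stack.Count > 0)
        {
            var (dir, depth) = stack.Pop();

            // Children of this folder would sit at depth + 1; skip them past the limit.
            if (depth >= maxDepth)
            {
                continue;
            }

            string[] subDirs;
            try
            {
                subDirs = Directory.GetDirectories(dir);
            }
            catch (UnauthorizedAccessException e)
            {
                Console.Error.WriteLine(e.Message);
                continue;
            }

            foreach (string sub in subDirs)
            {
                stack.Push((sub, depth + 1));

                // Same root-stripping idea as in the post above.
                string relative = sub.Substring(sourceRoot.Length).TrimStart('\\', '/');
                if (!Directory.Exists(Path.Combine(mirrorRoot, relative)))
                {
                    yield return sub;
                }
            }
        }
    }
}
```

With maxDepth set to 10, folders eleven or more levels down are never listed. A further saving, if reporting only the topmost missing folder is acceptable: skip the Push for a folder that is missing on the mirror, since every folder below it is necessarily missing too.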
