Code Project - The Lounge

Storing huge numbers of files

Tags: cryptography, question, lounge
kalberts (#1):

This is about file systems in general, although with a primary emphasis on NTFS: If you are expecting to store a huge number of files - on the order of 100k or more - on a disk, is there any significant advantage in spreading them over a number of subdirectories (based on some sort of hash)? Or are modern file systems capable of handling a huge number of files in a single-level directory? If there are reasons to distribute the files over a series of subdirectories, what are the reasons (/explanations) why it would be an advantage? Is this different, e.g., among different FAT variants, and with NTFS?

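A minimal sketch, in Python, of the hash-based spreading the question describes. The two-level 256x256 fan-out and the MD5-of-the-filename scheme are arbitrary choices for illustration, not something prescribed by NTFS or by anyone in the thread:

    import hashlib
    from pathlib import Path

    def sharded_path(root: Path, filename: str, levels: int = 2) -> Path:
        """Map a filename to root/ab/cd/filename using the leading hex
        bytes of its MD5 hash, so the files spread roughly evenly over
        256**levels subdirectories."""
        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
        subdirs = [digest[2 * i : 2 * i + 2] for i in range(levels)]
        return root.joinpath(*subdirs, filename)

    # Usage: compute the target path, create the parents, write the file.
    target = sharded_path(Path(r"C:\data"), "measurement_000123.dat")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(b"payload")

With two levels, 100k files average under two files per leaf directory; even a single level of 256 subdirectories keeps each directory at a few hundred entries.
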
Richard Andrew x64 (#2), in reply to kalberts (#1):

I can tell you from experience that Windows does not do well with thousands of files in a single directory. You will be much better off distributing them over many sub-dirs.

Off Topic: I think it's time for you to choose a user name instead of Member 7989122. :)

The difficult we do right away... ...the impossible takes slightly longer.

PIEBALDconsult (#3), in reply to kalberts (#1):

What size files? Do they have to be files? What sorts of access? How frequently?

Lost User (#4), in reply to kalberts (#1):

As far as I know, Windows itself doesn't mind it too much if there are lots of files in a folder. Explorer is another matter. So you can put lots of files in a folder, but you can never look at them. And FAT32 can only have 65,534 files in a folder.

kalberts (#5), in reply to Richard Andrew x64 (#2):

Can you provide an explanation of why it would be that way? Or is it at the "gut feeling" level?

kalberts (#6), in reply to PIEBALDconsult (#3):

Why would the size of the files matter? Very few are small enough to fit in the available space of the directory entry. Yes, they are files, by definition. Mostly, new files are added to the directory; that is the most common access. Reading the files back is far more infrequent.

kalberts (#7), in reply to Lost User (#4):

I hope to persuade most users to go for NTFS rather than FAT32. The most common access will be through an application, which will read the directory programmatically. Windows Explorer access can be considered an exception (although not that exceptional!).

Daniel Pfeffer (#8), in reply to kalberts (#1):

IIRC, NTFS uses a B-tree variant to store file names in a directory. This guarantees fast access to a single file, but may slow down access if you are trying, e.g., to enumerate all files in the directory. FAT32 has a limit of just under 64K entries, and the search is linear. Note that a long filename takes at least two entries - one for the short name and one for the long name. I don't know how exFAT stores directories.

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

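If enumerating a big directory is the slow case, how you read it matters as much as how it is stored. A small Python sketch of a streaming scan (the path is a made-up example): os.scandir yields entries one at a time instead of materializing the whole listing, and on Windows it can report file sizes from data cached during the scan rather than issuing a separate stat call per file:

    import os

    def count_and_total_size(directory: str) -> tuple[int, int]:
        """Stream directory entries rather than building a full list."""
        count, total = 0, 0
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    count += 1
                    total += entry.stat(follow_symlinks=False).st_size
        return count, total

    files, size = count_and_total_size(r"C:\data\measurements")
    print(f"{files} files, {size / 2**20:.1f} MiB")
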
Dave Kreskowiak (#9), in reply to kalberts (#7):

If users would be copying these files to a USB stick for any reason, you may run into a problem, as formatting a stick using FAT32 is a distinct possibility.

Asking questions is a skill. CodeProject Forum Guidelines. Google: C# How to debug code. Seriously, go read these articles.
Dave Kreskowiak

Lost User (#10), in reply to kalberts (#1):

File access; so, mostly reading "files"? A database would give you the most flexibility and performance.

Edit: You can easily expand SQL Server over multiple servers if need be, with more control over sharding and backups than with a regular filesystem.

Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^] "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.

PIEBALDconsult (#11), in reply to kalberts (#6):

"Why would the size of the files matter?" -- Because if the "files" are small enough, sticking them in some other cataloging system might be a better idea. Maybe a database, maybe a custom archiving system. Think of things like version control systems.

"File access is far more infrequent." -- Then just do whatever you want; it won't matter.

DRHuff (#12), in reply to kalberts (#5):

People don't relate well to numbers, and this is a place where camaraderie is important. A name - even an obvious alias - will make the interactions more personable. ;)

If you can't laugh at yourself - ask me and I will do it for you.

Daniel Pfeffer (#13), in reply to Richard Andrew x64 (#2):

Richard Andrew x64 wrote: "I think it's time for you to choose a user name instead of Member 7989122."

I don't know; see Peanuts for 30-Sep-1963 to 04-Oct-1963: https://peanuts.fandom.com/wiki/September_1963_comic_strips :)

Freedom is the freedom to say that two plus two make four. If that is granted, all else follows. -- 6079 Smith W.

Patrice T (#14), in reply to kalberts (#1):

Member 7989122 wrote: "If there are reasons to distribute the files over a series of subdirectories, what are the reasons (/explanations) why it would be an advantage?"

If performance degrades with the number of files in a directory, there is only one explanation: the directory is organized as a flat, unsorted list of files. This implies that to find a file, you have to scan the list/directory sequentially, which is O(n). If the OS keeps the directory sorted on the key you search by (the file name), the cost of finding a file is O(log n).

Patrice "Everything should be made as simple as possible, but no simpler." Albert Einstein

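Patrice's point can be demonstrated without touching a file system. A self-contained Python timing sketch (sizes and names are arbitrary) compares a sequential scan of an unsorted list - which is what a flat directory amounts to - with a binary search of a sorted one:

    import bisect
    import random
    import time

    names = [f"file_{i:06}.dat" for i in range(200_000)]
    unsorted_names = names[:]          # flat "directory": arbitrary order
    random.shuffle(unsorted_names)
    sorted_names = sorted(names)       # sorted index, as a B-tree gives you
    targets = names[::2_000]           # 100 lookups spread across the set

    start = time.perf_counter()
    for t in targets:
        _ = unsorted_names.index(t)    # O(n) sequential scan
    linear = time.perf_counter() - start

    start = time.perf_counter()
    for t in targets:
        i = bisect.bisect_left(sorted_names, t)  # O(log n) binary search
        assert sorted_names[i] == t
    binary = time.perf_counter() - start

    print(f"linear: {linear:.3f} s, binary: {binary:.6f} s")
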
Johnny J (#15), in reply to Richard Andrew x64 (#2):

Don't rush him. It's only been a little more than 9 years. He probably needs time to think of one... ;)

Anything that is unrelated to elephants is irrelephant
Anonymous
-----
The problem with quotes on the internet is that you can never tell if they're genuine
Winston Churchill, 1944
-----
Never argue with a fool. Onlookers may not be able to tell the difference.
Mark Twain

Johnny J (#16), in reply to kalberts (#5):

I can't explain why that is, but it's quite simple to test. Write a small piece of code that copies an image file into the same directory multiple times (something like the sketch after this post). It doesn't have to be 100,000 copies; I think 10,000-20,000 will suffice. Then try to open that directory with Explorer. That'll give you an idea about the problem. :sigh:

Anything that is unrelated to elephants is irrelephant
Anonymous
-----
The problem with quotes on the internet is that you can never tell if they're genuine
Winston Churchill, 1944
-----
Never argue with a fool. Onlookers may not be able to tell the difference.
Mark Twain

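A minimal version of that experiment in Python. The directory, file count, and payload size are placeholders, and generated bytes stand in for the copied image so the sketch is self-contained:

    import time
    from pathlib import Path

    test_dir = Path(r"C:\temp\many_files_test")  # hypothetical scratch folder
    test_dir.mkdir(parents=True, exist_ok=True)

    payload = bytes(1024)  # 1 KB of zeros standing in for the image data
    start = time.perf_counter()
    for i in range(20_000):
        (test_dir / f"copy_{i:05}.jpg").write_bytes(payload)
    elapsed = time.perf_counter() - start
    print(f"created 20,000 files in {elapsed:.1f} s - now open that folder in Explorer")
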
realJSOP (#17), in reply to kalberts (#1):

Maximum number of files on an NTFS volume: 4,294,967,295. As already mentioned, the problems will start when you try to browse the disk in question with pretty much any existing application. A better option would be to put the files in a database as blobs. At that point, you'll only have one file on the disk for the database itself. It would also be easier to organize and manage than a complex folder hierarchy.

".45 ACP - because shooting twice is just silly" - JSOP, 2010
-----
You can never have too much ammo - unless you're swimming, or on fire. - JSOP, 2010
-----
When you pry the gun from my cold dead hands, be careful - the barrel will be very hot. - JSOP, 2013

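A sketch of the files-as-blobs idea, using SQLite here purely because it is self-contained; the table layout is invented for the example, and a server database would follow the same pattern:

    import sqlite3

    # One database file on disk holds all the "files" as blobs.
    con = sqlite3.connect("filestore.db")
    con.execute("""
        CREATE TABLE IF NOT EXISTS files (
            name TEXT PRIMARY KEY,  -- indexed, so lookup by name is O(log n)
            data BLOB NOT NULL
        )
    """)

    def put_file(name: str, data: bytes) -> None:
        con.execute("INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
                    (name, data))
        con.commit()

    def get_file(name: str) -> bytes:
        row = con.execute("SELECT data FROM files WHERE name = ?",
                          (name,)).fetchone()
        return row[0] if row else b""

    put_file("measurement_000123.dat", b"payload")
    print(len(get_file("measurement_000123.dat")))

Small blobs tend to do well in this arrangement; for multi-megabyte files with many concurrent writers, a hybrid (a database index plus hashed subdirectories for the payloads) is a common compromise.
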
Nelek (#18), in reply to kalberts (#1):

We have some directories with that kind of file count; the record I can remember right now is around 450k files in a folder. They come from long-running measurements that produce between 3 and 5 data files a minute, each between 1 and 5 MB. Accessing the directory is slow, changing the sort order from name to timestamp is slow, moving the directory to another place is slow, getting the properties of the folder is slow, and deleting the folder once it is no longer needed is slow. Windows 10 is even slower, especially for "folder properties": it needs over 15 minutes to count the files and give the size of the folder, where Windows 7 did it in 30 or 40 seconds. We can't move that to FAT drives, due to the entry-count limitations others mentioned; it needs to be NTFS.

M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

Lost User (#19), in reply to kalberts (#7):

Windows Explorer will be your bottleneck... you will sit and wait while it "builds" a 100k tree view. Odds are, it will "hang". "Reading" directories is not a big deal; how you "display" them is.

It was only in wine that he laid down no limit for himself, but he did not allow himself to be confused by it. ― Confucian Analects: Rules of Confucius about his food

Jorgen Andersson (#20), in reply to kalberts (#1):

What are you saving the files for? How will you access them? And how will you search for them? One at a time, sequentially, by date, by name...?

Wrong is evil and must be defeated. - Jeff Ello
Never stop dreaming - Freddie Kruger