Storing huge numbers of files

kalberts wrote:

    Why would the size of the files matter? Very few are small enough to fit in the available space of the directory entry. Yes, they are files, by definition. Mostly, new files are added to the directory; that is the most common operation. Reading existing files is far less frequent.

JasonSQ replied (#41):

    File size is critically important. If a file overruns a multiple of the block size by even a little, the rest of that last block is dead space. Assuming a 4 KB block size and files holding 1 KB of data, that's 3 KB of wasted space on disk per file. If you zip up the files, they'll store much, much more efficiently. We have this problem with hundreds of thousands of small text files; we sweep them up and zip them into archive folders on occasion to clean up the folders and reclaim disk space.
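A quick way to see what that slack actually costs is to round each file up to the cluster size and total the difference. A minimal sketch in Python, assuming a 4 KB cluster (the NTFS default for most volume sizes; `fsutil fsinfo ntfsinfo` reports the real value):

```python
import os

CLUSTER = 4096  # assumed cluster size; your volume may differ

def slack(path: str, cluster: int = CLUSTER) -> int:
    """Bytes wasted in the final, partially filled cluster of one file."""
    size = os.path.getsize(path)
    return (cluster - size % cluster) % cluster

def directory_slack(root: str) -> tuple[int, int]:
    """File count and total wasted bytes under root."""
    count, wasted = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            count += 1
            wasted += slack(os.path.join(dirpath, name))
    return count, wasted

if __name__ == "__main__":
    n, w = directory_slack(".")
    print(f"{n} files, {w / 2**20:.1f} MiB lost to cluster slack")
```

Note that this overstates the waste for very small files, which NTFS can store resident in the MFT record itself (kalberts's point above about files fitting in the directory entry).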

kalberts wrote:

    This is about file systems in general, although with a primary emphasis on NTFS: if you are expecting to store a huge number of files - on the order of 100 k or more - on a disk, is there any significant advantage to spreading them over a number of subdirectories (based on some sort of hash)? Or are modern file systems capable of handling a huge number of files in a single-level directory? If there are reasons to distribute the files over a series of subdirectories, what are the reasons (/explanations) why it would be an advantage? Is this different, e.g., among different FAT variants, and with NTFS?

Jim Knopf jr replied (#42):

    Explorer does two things: it reads the directory entries and sorts them. Reading an entry looks linear in the directory size, and so does inserting it into sorted order, so over N entries you get N^2 time behavior. This hasn't changed for decades. Some file systems allow accessing a file through a kind of pointer, avoiding the directory once you know the pointer; nevertheless, adding and deleting files still has to touch the directory. Looking up directory names has the same problem, so it is better to construct a directory “tree”. File size doesn't matter for name lookups.
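The hash-based spreading kalberts asked about is simple to build: take a couple of fixed-width components from a hash of the file name, so no single directory grows large. A minimal sketch; the `shard_path` helper and the depth/width choices are illustrative, not any standard API:

```python
import hashlib
import os

def shard_path(root: str, filename: str, levels: int = 2, width: int = 2) -> str:
    """Map a file name to root/xx/yy/filename using hex digits of its hash."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, filename)

def store(root: str, filename: str, data: bytes) -> str:
    """Write data at its sharded location, creating directories as needed."""
    path = shard_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Two levels of two hex digits give 65,536 leaf directories, so 100 k files average under two per directory, and every directory scan stays short regardless of how the file system's lookup scales.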
