Fast files

  • hint_54

    Hi there! I'm writing an app that deals a lot with files (both reading and writing), and that is slowing my app down. What I need is a fast way to do the following operations on files: read, write, append, open, close. I'm using C++, but inline assembly is an option. Thx!

    kakan
    #21

    Hello. I would suggest that you:

    1. Get the cluster size of the disk you are using, then create a buffer of that size. Do all reads and writes (if possible) in multiples of the cluster size.
    2. Turn off stack checking, at least for the functions you use most frequently.
    3. Do not, repeat NOT, use time- (and CPU-) consuming functions in your code. Especially avoid the (x)printf functions at all times; they are incredibly time and CPU consuming!

    Another question: you say there shouldn't be any limitations on the file size. Are you aware that the f-funcs have a file size limit of about 4 GB? If you want to avoid that limitation, you have to use the real Win32 functions: CreateFile, ReadFile etc. If you decide to use them instead, you also get the possibility of overlapped I/O, which might speed up the file I/O. Otherwise, if you stay with the f-funcs, consider using open(), read(), write(), close() etc. They are closer to the file system than the f-funcs (not by much, but it's worth trying). Kakan
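    To make point 1 concrete, here is a minimal sketch of that approach, assuming a local volume whose geometry GetDiskFreeSpaceA can report; the drive letter and the file name input.dat are just placeholders:

        #include <windows.h>
        #include <vector>

        int main()
        {
            // Cluster size = sectors per cluster * bytes per sector.
            DWORD sectorsPerCluster = 0, bytesPerSector = 0, freeClusters = 0, totalClusters = 0;
            if (!GetDiskFreeSpaceA("C:\\", &sectorsPerCluster, &bytesPerSector,
                                   &freeClusters, &totalClusters))
                return 1;
            const DWORD clusterSize = sectorsPerCluster * bytesPerSector;

            // Open with the plain Win32 calls rather than the f-funcs.
            HANDLE hFile = CreateFileA("input.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            if (hFile == INVALID_HANDLE_VALUE)
                return 1;

            std::vector<char> buffer(clusterSize);
            DWORD bytesRead = 0;
            // Read the file one cluster-sized chunk at a time.
            while (ReadFile(hFile, buffer.data(), clusterSize, &bytesRead, NULL) && bytesRead > 0)
            {
                // ... process buffer[0..bytesRead) ...
            }

            CloseHandle(hFile);
            return 0;
        }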

  • James R Twine

    DavidCrow wrote:

    A memory-mapped file is a spot in memory that can be accessed (e.g., open, close, read, write, seek) as though it were an actual file.

    Sorry, but I think that is supposed to be the other way around. Memory-mapping something allows it to be accessed like normal memory (through a pointer). For example, if you had memory-mapped buttons/switches in your hardware (common on arcade games), you would be able to read the state of the switches by reading values from one or more specific memory addresses. The same goes for files. If you memory-map a file, you get an address that points to some location in the file, and you can then access the contents of that file through the pointer. Peace! -=- James

    If you think it costs a lot to do it right, just wait until you find out how much it costs to do it wrong!
    Tip for new SUV drivers: Professional Driver on Closed Course does not mean your Dumb Ass on a Public Road!
    DeleteFXPFiles & CheckFavorites (Please rate this post!)
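    As a rough sketch of the file case James describes, getting a pointer into a file via CreateFileMapping/MapViewOfFile might look like this (file name is a placeholder and error handling is kept minimal):

        #include <windows.h>

        int main()
        {
            HANDLE hFile = CreateFileA("input.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            if (hFile == INVALID_HANDLE_VALUE)
                return 1;

            DWORD fileSize = GetFileSize(hFile, NULL);
            HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
            if (hMapping != NULL)
            {
                // MapViewOfFile hands back a pointer; the OS pages the contents in on demand.
                const char* data = (const char*)MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
                if (data != NULL)
                {
                    // ... read data[0..fileSize) directly, with no explicit ReadFile or buffer copies ...
                    UnmapViewOfFile(data);
                }
                CloseHandle(hMapping);
            }
            CloseHandle(hFile);
            return 0;
        }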

    David Crow
    #22

    James R. Twine wrote:

    Sorry, but I think that is supposed to be the other way around.

    Fair enough. Admittedly I've never used MMF before. Although I did once change an application that was reading a file a few bytes at a time to use CMemFile instead. Talk about a major speed improvement! Processing went from hours (some of the files were hundreds of MB in size) to just a few minutes.

    "Take only what you need and leave the land as you found it." - Native American Proverb

  • James R Twine

    Using MMF can still improve performance, because you do not have to do the manual loading of data into a buffer - the OS basically does it for you. For example, if you were reading the file in 4KB chunks, you would be allocating a 4KB buffer, copying from the file into that buffer, and then likely processing the contents of the buffer using the buffer's address. Using a MMF does that work for you. If you want to impose a limit on the size of the MMF section you create, that is fine. Choose a limit, say 2MB (or 4MB, or 64MB, whatever). If the file is 2MB or smaller, you can MM the entire file. If not, you can MM 2MB sections of the file one at a time. Peace! -=- James

    If you think it costs a lot to do it right, just wait until you find out how much it costs to do it wrong!
    Tip for new SUV drivers: Professional Driver on Closed Course does not mean your Dumb Ass on a Public Road!
    DeleteFXPFiles & CheckFavorites (Please rate this post!)
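    A sketch of the section-by-section variant, assuming a 2 MB view size (any multiple of the 64 KB allocation granularity works) and a placeholder file name; error handling is trimmed for brevity:

        #include <windows.h>

        void ProcessInMappedSections()
        {
            const LONGLONG kViewSize = 2 * 1024 * 1024;   // 2 MB views

            HANDLE hFile = CreateFileA("big.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            LARGE_INTEGER size = {};
            GetFileSizeEx(hFile, &size);
            HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);

            for (LONGLONG offset = 0; offset < size.QuadPart; offset += kViewSize)
            {
                LONGLONG remaining = size.QuadPart - offset;
                SIZE_T chunk = (SIZE_T)(remaining < kViewSize ? remaining : kViewSize);
                const char* view = (const char*)MapViewOfFile(hMapping, FILE_MAP_READ,
                                                              (DWORD)(offset >> 32),        // offset, high part
                                                              (DWORD)(offset & 0xFFFFFFFF), // offset, low part
                                                              chunk);
                if (view == NULL)
                    break;
                // ... process view[0..chunk) ...
                UnmapViewOfFile(view);
            }

            CloseHandle(hMapping);
            CloseHandle(hFile);
        }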

    hint_54
    #23

    Thx! That has been VERY helpful! :)

  • kakan

    Actually, file I/O is never done at the byte level. The smallest unit of information that can be read from, or written to, a file is one sector. A sector is always an even multiple of 128 bytes; the most common sector size is 512 bytes. So the runtime does buffer (at least) one sector (or, more likely, a cluster). I think the main reason reading a file one byte at a time is so slow is the function-call overhead and all the checks that have to be done in the runtime before it can return the byte in question.
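    One way to work with that runtime buffering rather than against it is to hand the stream a larger buffer via setvbuf. That is not something mentioned in this thread, just the standard stdio knob, and the 64 KB size below is arbitrary:

        #include <cstdio>

        int main()
        {
            FILE* fp = fopen("input.dat", "rb");
            if (fp == NULL)
                return 1;

            // Replace the default (roughly sector/cluster sized) stdio buffer with a
            // 64 KB one, so each underlying read fetches a bigger chunk at once.
            static char bigBuffer[64 * 1024];
            setvbuf(fp, bigBuffer, _IOFBF, sizeof(bigBuffer));

            int c;
            while ((c = fgetc(fp)) != EOF)
            {
                // ... per-byte processing still pays the call overhead,
                //     but refills from disk are now much rarer ...
            }

            fclose(fp);
            return 0;
        }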

    hint_54
    #24

    kakan, a few things on that:

    kakan wrote:

    So the runtime does buffer (at least) one sector (or more likely, a cluster).

    Could you be more precise about which of those the runtime buffers in a single shot, a sector or a cluster? And what if I read more than just one sector/cluster: will it buffer the necessary sectors/clusters in a single operation, or will it take the same amount of time as reading the 2, 3, or more sectors/clusters in separate operations? Thx!


    hint_54
    #25

    Hi there!

    kakan wrote:

    I would suggest that you: 1. Get the cluster size of the disk you are using, then create a buffer of that size. Do all reads and writes (if possible) in multiples of the cluster size. 2. Turn off stack checking, at least for the functions you use most frequently. 3. Do not, repeat NOT, use time- (and CPU-) consuming functions in your code. Especially avoid the (x)printf functions at all times; they are incredibly time and CPU consuming!

    In the same order ;) 1. OK with that. 2. How do I disable stack checking? 3. Also OK with that, I'm not using them. :) I also noted that you advise using CreateFile, ReadFile, etc. instead of the f-functions because of their limitation. Does that limitation also apply to the open(), read(), write() (and so on) functions? And which are faster: the Win32 functions or the DOS-style open/read/write ones? Thx! hint_54


    kakan
    #26

    Hello and good morning. About the stack check, here is a snippet:

        #pragma check_stack(off)

        /* Funcs that are called often... */
        char * _fastcall CWrTapeTh::w32fgets(char *string, int n)
        {
            ....
        }

        #pragma check_stack(on)

    The 4 GB limitation applies to all the old file handling funcs, both the f-funcs (fopen, fwrite, ...) and open, write, etc. The reason for this limitation is a 32-bit value (unsigned long, I think) that holds the current position in the file, and that counter wraps at (approximately) 4 GB. The Win32 funcs don't have that file size limit.

    Which one is fastest? To be honest, I don't really know. But the Win32 funcs are the only way to go if you want to be able to handle files of any size, and my guess is that they can be really fast. Besides (as I said in my earlier post), the Win32 funcs can use overlapped I/O, which means that you can have several reads/writes going on at the same time.

    Just try to write a file to a diskette with the f-funcs and time it. Then write the same file to the hard drive, copy that file to the diskette, and time the copy. Compare the times; you will see a remarkable difference. Why? I'm not 100% sure, but my guess is that Windows' copy uses overlapped I/O.

    I know there are samples of overlapped file I/O on MSDN. Maybe I should dig deeper into this and post an article at CP? :) Kakan
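    A minimal sketch of the overlapped I/O kakan mentions: open with FILE_FLAG_OVERLAPPED, start a read, do other work, then collect the result. The file name and buffer size are placeholders:

        #include <windows.h>

        int main()
        {
            HANDLE hFile = CreateFileA("input.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                                       OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
            if (hFile == INVALID_HANDLE_VALUE)
                return 1;

            static char buffer[64 * 1024];
            OVERLAPPED ov = {};
            ov.Offset = 0;                                      // file offset lives in the OVERLAPPED struct
            ov.hEvent = CreateEventA(NULL, TRUE, FALSE, NULL);  // signaled when the read completes

            if (!ReadFile(hFile, buffer, sizeof(buffer), NULL, &ov)
                && GetLastError() != ERROR_IO_PENDING)
            {
                // genuine failure
            }
            else
            {
                // ... do other work (or issue more reads) while this read is in flight ...
                DWORD bytesRead = 0;
                GetOverlappedResult(hFile, &ov, &bytesRead, TRUE);   // TRUE = wait for completion
                // ... buffer[0..bytesRead) is now valid ...
            }

            CloseHandle(ov.hEvent);
            CloseHandle(hFile);
            return 0;
        }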


    kakan
    #27

    Hello. I'm a bit on thin ice here. For CreateFile, you can set how the file will be accessed. I think MS calls it a "hint" for the file system. See the docs for CreateFile and all of the FILE_FLAG_ flags. It's quite informative. Kakan
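    For example, the sequential-scan hint looks like this; the other flags covered in the CreateFile docs go in the same parameter (the function name and path are just placeholders):

        #include <windows.h>

        HANDLE OpenForSequentialRead(const char* path)
        {
            // FILE_FLAG_SEQUENTIAL_SCAN hints to the cache manager that the file will
            // be read front to back, so it can read ahead more aggressively.
            return CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
            // Other hints in the same parameter: FILE_FLAG_RANDOM_ACCESS for scattered
            // seeks, FILE_FLAG_NO_BUFFERING to bypass the cache (reads must then be
            // sector-aligned and sector-sized), FILE_FLAG_WRITE_THROUGH, etc.
        }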
