Code Project › Home › General Programming › C / C++ / MFC

Fast files

C / C++ / MFC · c++ · 27 posts · 7 posters
• #1 · hint_54:

    Hi there! I'm writing an app that deals a lot with files (both reading and writing), and that is slowing my app down. What I need is a fast way to do the following operations on files: read, write, append, open, close. I'm using C++, but inline assembly is an option. Thx!

• #2 · cgreathouse, in reply to hint_54 (#1):

      What APIs are you using for file I/O? Have you done any profiling to see where the bottlenecks are? Since file I/O is extremely slow compared to memory I/O, I don't think you'll see much difference between C++ and assembly. Without knowing more about your app, it's difficult to give any advice. Chris

• #3 · David Crow, in reply to hint_54 (#1):

        A memory-mapped file might save you a small amount of time.


        "Take only what you need and leave the land as you found it." - Native American Proverb

• #4 · hint_54, in reply to cgreathouse (#2):

          I am writing a file-compressing app. But, for now, I'm just improving/optimizing the compression algorithm that I'm using. This algorithm has to write a lot of bytes to several files (depending on the original file's size), and I'm using fwrite(), fopen(), fclose(), fseek() and fread() from stdio.h. Bottlenecks - I don't know what that is! Is this helpful? Thx!

• #5 · hint_54, in reply to David Crow (#3):

            Hmmm... you are talking about DMA, right? I don't know how to use it with files! Thx!

• #6 · Mathieu Dijkstra, in reply to hint_54 (#5):

              [Message Deleted]

• #7 · cgreathouse, in reply to hint_54 (#4):

                Bottlenecks are spots in your code where most of the execution time is spent. I would suggest you profile your app and see where most of the time is spent.

                If you're using VC6, there is a profiler that ships with it. It takes some reading to figure out how to use it (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_core_using_profile.2c_.prep.2c_.and_plist.asp[^]). There's also a way to run it from VS, but it doesn't always work (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_core_using_profile.2c_.prep.2c_.and_plist.asp[^]).

                If you're using VS 2002 or 2003, you'll have to get the profiler from Compuware. They have a freebie version available (http://www.compuware.com/products/devpartner/profiler/default.asp?cid=3019X36&focus=DevPartner&source=Web+%2D+Evaluation+Request&desc=Download+%2D+%27DevPartner+Profiler+Community+Edition%27&offering=DevPartner&sf=1&p=0[^]). If you're using VS 2005, there is a profiler included.

                Once you see where most of the time is being spent, you can then start to think about how to improve the performance.

• #8 · QuiJohn, in reply to hint_54 (#1):

                  A common beginner's mistake I see when doing file I/O is writing or reading one byte at a time (or a very small number of bytes). You can greatly reduce I/O time by writing/reading good-sized chunks at a time. Perhaps this is your issue? How big to make the buffers is pretty much application-dependent, but even going to, say, several KB at a time can make a huge difference.

• #9 · David Crow, in reply to hint_54 (#5):

                    hint_54 wrote:

                    You are talking about DMA, right?

                    No. DMA is a way to access memory independently of the CPU.

                    hint_54 wrote:

                    I don't know how to use it with files!

                    A memory-mapped file is a spot in memory that can be accessed (e.g., open, close, read, write, seek) as though it were an actual file. By eliminating file I/O, a speed increase is (normally) realized. See these for examples:

                    • http://www.ecst.csuchico.edu/~beej/guide/ipc/mmap.html
                    • http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngenlib/html/msdn_manamemo.asp
                    • http://www.codeproject.com/file/xreverse.asp
                    • http://www.codeproject.com/file/findidaddressbook.asp
                    • http://www.codeproject.com/file/Memory_Mapped_Class___PBD.asp
                    • http://www.codeproject.com/win32/cmemmap.asp



• #10 · David Crow, in reply to hint_54 (#4):

                      hint_54 wrote:

                      bottlenecks - I don't know what that is!

                      It is a metaphor for trying to cram a bunch of work into a small passage. With a bottle, no matter its capacity, there is a physical limit to the amount of liquid that can pass through the neck (which is usually narrower than the bottle itself).



• #11 · hint_54, in reply to QuiJohn (#8):

                        Actually, I have been considering that fact for a reason: as an algorithm restriction, I must deal with the data byte by byte in order to accomplish the compression. So I began wondering which was faster: to read byte by byte, or to apply right shifts and bitwise ORs and ANDs to extract the data (into a single byte) and then apply the rest of the algorithm to that byte, which also includes those bitwise operations. I would guess that the 2nd option is faster because it works only with registers and main memory, but there are quite a few of those operations in the algorithm. So my question is: does it really pay off to read memory chunks instead of byte by byte in this particular case? Thx! hint_54

• #12 · hint_54, in reply to David Crow (#9):

                          Yes, I know what you mean, but isn't that memory-consuming for larger files? hint_54

• #13 · David Crow, in reply to hint_54 (#12):

                            hint_54 wrote:

                            ...isn’t that memory consuming for larger files?

                            I guess that depends on your definition of "larger files." I wouldn't think twice about using MMF for files several MB in size. In any case, you can restrict the mapping to only the portion of the file you are interested in.



• #14 · hint_54, in reply to David Crow (#13):

                              There is no definition of larger files, because this app is meant to be used by someone else, meaning that I can never know the maximum file size to be used (it can be anything! And limiting that size is not an option.), and I can't load portions of the file because the algorithm reads the file sequentially, works on the data (byte by byte) and stores the output right away. But thx anyway ;) hint_54

• #15 · David Crow, in reply to hint_54 (#11):

                                You can read the file in large chunks and process that chunk byte by byte. Even when cached, doing disk I/O one byte at a time is painfully slow.



• #16 · hint_54, in reply to cgreathouse (#7):

                                   My app spends most of its time reading from/writing to files. This is the main reason why I need a fast way to read/write files, as the rest of the algorithm isn't very time-consuming; I'd say 85% of the time spent in the compression algorithm is spent on files. I need a fast way to read/write files in sequence. Thx. hint_54

• #17 · hint_54, in reply to David Crow (#15):

                                    hmmm... I see! Thx a lot :-D hint_54

• #18 · James R Twine, in reply to David Crow (#9):

                                      DavidCrow wrote:

                                      A memory-mapped file is a spot in memory that can be accessed (e.g., open, close, read, write, seek) as though it were an actual file.

                                      Sorry, but I think that is supposed to be the other way around. Memory-mapping something allows it to be accessed like normal memory (through a pointer). For example, if you had memory-mapped buttons/switches in your hardware (common on arcade games), you would be able to read the state of the switches by reading values from one or more specific memory addresses. The same goes for files. If you memory-map a file, you get an address that points to some location in the file, and you can then access the contents of that file through the pointer. Peace! -=- James


                                      If you think it costs a lot to do it right, just wait until you find out how much it costs to do it wrong!
                                      Tip for new SUV drivers: Professional Driver on Closed Course does not mean your Dumb Ass on a Public Road!
                                      DeleteFXPFiles & CheckFavorites (Please rate this post!)

• #19 · James R Twine, in reply to hint_54 (#14):

                                        Using an MMF can still improve performance, because you do not have to do the manual loading of data into a buffer - the OS basically does it for you. For example, if you were reading the file in 4 KB chunks, you would be allocating a 4 KB buffer, copying from the file into that buffer, and then likely processing the contents of the buffer using the buffer's address. Using an MMF does that work for you. If you want to impose a limit on the size of the MMF section you want to create, that is fine. Choose a limit, say 2 MB (or 4 MB, or 64 MB, whatever). If the file is 2 MB or smaller, you can memory-map the entire file. If not, you can memory-map 2 MB sections of the file one at a time. Peace! -=- James



• #20 · kakan, in reply to QuiJohn (#8):

                                          Actually, file I/O is never done at the byte level. The smallest unit of information that can be read from, or written to, a file is one sector. A sector is always an even multiple of 128 bytes; the most common sector size is 512 bytes. So the runtime does buffer (at least) one sector (or, more likely, a cluster). I think the main reason reading a file one byte at a time is slow is the function-call overhead and all the checks that have to be done in the runtime before it can return the byte in question.
