Writing/Reading Large files

Forum: C / C++ / MFC · Tags: c++, performance, help, question, career
Jenleonard (#1) wrote:

    I am debugging a Visual C++ program, and my job is to speed up the file access. Currently the program reads a 600 MB ASCII text file using fopen and fscanf, etc., in text mode. Is there an easy way to speed this up? I have tried using streams, but they aren't defined. Would changing it to binary make it faster? Thanks so much for your help! Jennifer


Christian Graus (#2) wrote:

    JenniferLeonard522 wrote: "I have tried using streams and they aren't defined." How do you mean, they aren't defined? Regardless of speed issues, moving your file handling code to C++ is a positive step. Christian Graus - Microsoft MVP - C++


Anonymous (#3) wrote:

    When I use ifstream and cin, they aren't defined. The code I am debugging currently uses fopen, fscanf, fprintf, etc. Thanks! Jen


Christian Graus (#4) wrote:

    Anonymous wrote: "When I use ifstream and cin they aren't defined." Did you include them? ifstream is in <fstream> and cin is in <iostream>. You need to scope them with std::, or pull them into the global namespace with using declarations. Anonymous wrote: "the code I am debugging currently uses fopen, fscanf, fprintf, etc." Yeah, the world is full of crappy code that uses those instead of C++. It's a common problem. Christian Graus - Microsoft MVP - C++


Axter (#5) wrote:

    For reading files of this size, I don't recommend the C++ stream classes. Instead, I recommend you use the file-mapping API functions, which will greatly increase the speed of your code. For more info, look in your help files for the following API functions: MapViewOfFile, CreateFileMapping, UnmapViewOfFile. For example code, check out the following links: http://code.axter.com/MapFileToMemory.h and http://code.axter.com/mapfile2mem.cpp The C++ stream classes are reliable, but not very speedy. If you have a large file and you need speed, they're not the best choice. Top ten member of C++ Expert Exchange. http://www.experts-exchange.com/Cplusplus


kakan (#6) wrote:

    Hello. One way to speed up file handling (while keeping stream I/O) could be to use fread() and fwrite() with a buffer size that matches the disk sector size. Since the normal sector size is 512 bytes, read and write blocks that are an even multiple of 512. (I guess the most effective disk I/O would be to read/write a complete disk cluster at once, but I'm not sure.) This means you have to use your own code to extract/create the data (text lines) from the buffer, but that job has to take place anyhow. Whether it's done in the runtime library or in your own code doesn't really matter (provided your own code is efficiently written). Another way could be to use the native Win32 API with overlapped I/O. But that works (logically) the same way as fread/fwrite, which means you still have to write your own code to handle the buffer.


John R Shaw (#7) wrote:

    I rated your answer as 5 (but I don't know if it took). You are correct about reading and writing a cluster at a time (or a multiple thereof). The problems may start to appear with how the required memory is managed, which the question ignores (I'm not going into that). The simplest solution (if they are using MFC) is to use a CMemFile (which side-steps the issue entirely and allows MFC to handle it). INTP Every thing is relative...


Jenleonard (#8) wrote:

    Thank you for all the suggestions. I looked into CMemFile, but the documentation says: "Because CMemFile doesn't use a disk file, the data member CFile::m_hFile is not used and has no meaning." I read through the description and didn't see how to use it to read a large data file from disk. I also looked into reading/writing the file a cluster at a time, but the problem is that my file is constantly growing (every 10 minutes it gets more information). So right now it is 600 MB, but in a couple of months it will be 800 MB. What would be the quickest way to read a large file like this? Thanks! Jen


John R Shaw (#9) wrote:

    The first statement, that m_hFile is not used and has no meaning, is essentially true. That is, the file handle was only needed long enough to load the file into memory. I (incorrectly) assumed that was not the case (blast it). The fastest way to read a file is a cluster at a time. You do not need to actually care about this detail, since you are only reading the file as a whole. Look into shared file read access. This will not speed up the actual read time, but it will reduce the perceived time (if used properly). What this means is that if the file is only changed by appending to it, then you just need to read the additional information added to the file. What I mentioned above is also true if you do not have shared file read access. What I mean is that (assuming the other application closes the file after writing to it) you can check whether the file size has changed on a regular basis, and just read the changes. What all that boils down to is this: only read what has changed and nothing more. I hope that helps, because I can't explain all the ideas that have popped into my head. INTP Every thing is relative...
