Code Project

HDD access chaotic speed problem.

    progDes
    wrote on last edited by
    #1

    Hi. I'm using C++ with std::fstream on Windows. I have an algorithm that processes a large file (actually a series of algorithms, but let's consider only one, because the others have the same problem). The algorithm reads chunks of data from one file and writes them to another file, realigning the data along the way (for reference, I'm realigning 3D volume data). The read chunk size is 512 bytes and the write chunk size is 16 KB.

    Usually the algorithm finishes in 1 minute and 50 seconds, but I noticed that sometimes (rarely) it finishes in 24 seconds, processing the same file along the same execution path. I've started looking for the reason for this slowdown and for a way to control it:

    1) I have tried increasing the coalescing of the disk accesses.
    2) I have considered fragmentation (the output file is written in small chunks, so it is fragmented, about 600 fragments). When I resolved the fragmentation problem (so the file is guaranteed not to be fragmented), nothing changed: the access speed is still chaotic.
    3) I have investigated the possibility that Windows flushes my memory buffers to the HDD. No, that's not the case.
    4) I have found that if I do the whole operation on another physical disk (not the one with the OS), I get the slowdown more rarely, and the algorithm usually finishes in 40 seconds (but the HDD access speed is still chaotic).

    The access speed often rises or falls during a single execution, sometimes several times. It looks as if my accesses are going out of sync with some internal HDD or OS operations; I don't know. Does anyone have experience with this, or an idea? Thanks.

      Lost User
      wrote on last edited by
      #2

      If you read and write different files on the same disk, the head has to move to a different place on the disk between accesses, which is slow. If your input and output files are on different drives, this problem goes away. If your program alternates between reading and writing (on the same physical disk) every time, the slowdown is at its worst. Note that some variation in disk speed can't be prevented in a multitasking OS where other programs might be using the disk too. If your memory permits, you could consider reading the entire input file to memory at once at the beginning of your algorithm and then processing it and writing the results.

        progDes
        wrote on last edited by
        #3

        My files are always on the same drive in all tests, both when it finishes in 1:54 and when it finishes in 24 seconds.

        Thaddeus Jones wrote:

        Note that some variation in disk speed can't be prevented in a multitasking OS where other programs might be using the disk too.

        I always make sure nothing else is running. I'm also watching Resource Monitor, and there are no big HDD accesses except from my program.

        Thaddeus Jones wrote:

        If your memory permits, you could consider reading the entire input file to memory at once at the beginning of your algorithm and then processing it and writing the results.

        No, this is not possible; the algorithm should work with unlimited file size.

          progDes
          wrote on last edited by
          #4

          The problem is not "the algorithm is slow" but "the algorithm's speed is too chaotic". Sometimes it finishes 5 times faster than usual, which means it should be able to finish 5 times faster every time. I agree that access speed can vary a little, but 5 times... I think this is something that should be figured out.

            David Crow
            wrote on last edited by
            #5

            progDes wrote:

            Usually this algorithm finishes in 1 minute and 50 seconds. But I noticed that sometimes (rarely) it finishes in 24 seconds! Processing the same file, the same execution path. I've started to search for the reasons of why this slowdown is happening and how can I control that.

            Caching, perhaps?


              Lost User
              wrote on last edited by
              #6

              Maybe you could increase your input buffer from 512 bytes to, say, 100 MB, and each time you've processed the 100 MB from memory, read a new 100 MB. Similarly, writing your output to a memory buffer (say, also 100 MB) and writing that to the file once the buffer is full should help with speed too. The idea is to concentrate disk access on areas of the disk that are near each other, since those accesses are much faster than ones where the head has to be repositioned.
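
A minimal sketch of the large-buffer idea, assuming a plain byte-for-byte copy stands in for the realignment step (which the thread does not show). The function name `copy_with_large_buffers` and the `buffer_bytes` parameter are hypothetical; the buffer size is something to tune, not a recommended value.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Copies in_path to out_path with one large read and one large write per
// buffer, so the disk sees long sequential runs instead of many small
// alternating accesses. Returns the number of bytes copied, or -1 on error.
long long copy_with_large_buffers(const std::string& in_path,
                                  const std::string& out_path,
                                  std::size_t buffer_bytes) {
    assert(buffer_bytes > 0);
    std::ifstream in(in_path, std::ios::binary);
    std::ofstream out(out_path, std::ios::binary | std::ios::trunc);
    if (!in || !out) return -1;

    std::vector<char> buffer(buffer_bytes);
    long long total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        // Placeholder: the realignment would operate on buffer[0..got) here.
        out.write(buffer.data(), got);
        if (!out) return -1;
        total += got;
    }
    return total;
}
```

A call such as `copy_with_large_buffers("volume.raw", "realigned.raw", 100u * 1024 * 1024)` (file names made up for illustration) would use a 100 MB buffer as suggested in the post. Whether this helps with the variation in speed, rather than only the average, is exactly what is in question in this thread.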

                Andrew Brock
                wrote on last edited by
                #7

                There are countless factors that could be contributing to this. I would guess that the major ones are file caching and delayed writes.

                File caching: When a file is opened and read, its contents are loaded into RAM by the OS, and sections are then copied to your exe as you need them (usually in 4 KB chunks, from which you then read smaller chunks). Once the file is closed it is marked as unused by the OS, but it is not removed from RAM. If another program then needs heaps of memory, the file will be removed from RAM; if that does not happen and your file is still in RAM, your program doesn't actually need to use the hard disk. This is most noticeable with a program that uses lots of files on load, say MS Word: if you close the program and open it again shortly afterwards without opening something else, it will load much quicker the second time.

                Delayed writes: These generally only occur with slow media such as USB memory sticks, but they can happen with HDDs as well. When you write to a file and the storage device is busy, the data will often be written to a virtual file in RAM, which is then written to the storage device at a later stage.

                Other problems may include HDD head seeks (as mentioned by Thaddeus Jones) and other programs accessing the disk. If you are running Windows Vista or 7, you can look at disk accesses with the Resource Monitor (resmon.exe).
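
One way to check whether file caching explains the difference is to time the same sequential read twice in a row. The sketch below is only an illustration: `ReadTiming` and `time_sequential_read` are hypothetical names, and nothing in it forces the first read to be "cold", so the result is a hint rather than proof.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct ReadTiming {
    long long bytes;  // bytes read, or -1 if the file could not be opened
    double seconds;   // wall-clock time spent reading
};

// Reads the whole file in chunk_bytes pieces and reports how long it took.
ReadTiming time_sequential_read(const std::string& path, std::size_t chunk_bytes) {
    assert(chunk_bytes > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return {-1, 0.0};

    std::vector<char> chunk(chunk_bytes);
    long long total = 0;
    const auto start = std::chrono::steady_clock::now();
    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        total += in.gcount();  // counts the final partial chunk as well
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {total, elapsed.count()};
}
```

Calling it twice on the same large file and comparing the two `seconds` values shows the effect described above: if the second call is several times faster, the data was most likely served from RAM rather than from the disk.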

                  progDes
                  wrote on last edited by
                  #8

                  Thaddeus Jones wrote:

                  Maybe you could increase your input buffer from 512 bytes to, say, 100 MB, and each time you've processed the 100 MB from memory, read a new 100 MB. Similarly, writing your output to a memory buffer (say, also 100 MB) and writing that to the file once the buffer is full should help with speed too.

                  I've tried this approach. Although it gives a slight speed increase, it doesn't resolve the problem of chaotic speed.

                    Lost User
                    wrote on last edited by
                    #9

                    I'm afraid I'm out of ideas then :)

                      progDes
                      wrote on last edited by
                      #10

                      Thanks Andrew, I will consider the file caching angle; I need to investigate it more. Meanwhile, do you think that occasional disk accesses by other programs can reduce the speed of my accesses by a factor of 5? I make sure that no heavy HDD operations are performed by other programs, but other programs surely access the disk even when idle.

                        Niklas L
                        wrote on last edited by
                        #11

                        If you run out of ideas, have you tried disabling anti-virus software? Maybe it performs some weird caching of scanned data. Even a long shot is a shot...

                          progDes
                          wrote on last edited by
                          #12

                          Yes, I tried. Actually, I now think this is a file caching problem. It seems that caching does not always work well in my case. I will try to disable it and do some caching of my own.
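
The thread does not show what "caching of my own" ended up looking like. As one possible shape, here is a hypothetical read-through block cache with least-recently-used eviction on top of std::ifstream. The class name `BlockCache`, the block size and the capacity are all assumptions made for illustration, and disabling the OS cache itself is platform-specific and not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical read-through cache: keeps up to max_blocks fixed-size blocks
// of one file in memory and evicts the least recently used block when full.
class BlockCache {
public:
    BlockCache(const std::string& path, std::size_t block_bytes, std::size_t max_blocks)
        : in_(path, std::ios::binary), block_bytes_(block_bytes), max_blocks_(max_blocks) {
        assert(block_bytes_ > 0 && max_blocks_ > 0);
    }

    bool is_open() const { return in_.is_open(); }
    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

    // Returns block `index` (shorter at end of file, empty past the end).
    // The reference stays valid only until that block is evicted.
    const std::vector<char>& block(std::uint64_t index) {
        auto hit = lookup_.find(index);
        if (hit != lookup_.end()) {
            // Move the entry to the front: it is now the most recently used.
            lru_.splice(lru_.begin(), lru_, hit->second);
            ++hits_;
            return hit->second->second;
        }
        ++misses_;
        std::vector<char> data(block_bytes_);
        in_.clear();  // reset eof/fail state left by an earlier short read
        in_.seekg(static_cast<std::streamoff>(index * block_bytes_));
        in_.read(data.data(), static_cast<std::streamsize>(data.size()));
        data.resize(static_cast<std::size_t>(in_.gcount()));
        if (lru_.size() == max_blocks_) {
            // Cache is full: drop the least recently used block.
            lookup_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.emplace_front(index, std::move(data));
        lookup_[index] = lru_.begin();
        return lru_.front().second;
    }

private:
    using Entry = std::pair<std::uint64_t, std::vector<char>>;
    std::ifstream in_;
    std::size_t block_bytes_;
    std::size_t max_blocks_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::uint64_t, std::list<Entry>::iterator> lookup_;
};
```

With a cache like this, the 512-byte reads from the original post would be served from whichever block contains them, so the number of actual disk reads depends on the block size and access pattern rather than on what the OS cache happens to hold. The `hits()` and `misses()` counters give a quick way to see whether the chosen block size suits the realignment's access pattern.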
