Fast work: OVERLAPPED vs MEMORY MAPPED
-
For an I/O-bound processing job, which would be faster:

1) Memory map the file and process it in place.
2) Use overlapped I/O, reading into one of two buffers while processing the other (pipelined).

On a single CPU, would I get any benefit from #2? On a dual CPU, does #1 gain nothing over what it gets on a single CPU? Thanks.
-
If you are truly I/O-bound then I suspect overlapped I/O would be fastest, since it allows the I/O to happen simultaneously with the data processing. You might need to adjust your buffering scheme to take maximum advantage. One option is a buffer/thread pool where one or more buffers are being read into, some are being processed, and some are being written out, all at the same time; a rough sketch of the read side is below the sig.

__________________________________________
a two cent stamp short of going postal.
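Something along these lines (untested sketch; ProcessBlock() is a made-up stand-in for the real work, and writes of the processed output could be overlapped the same way with WriteFile and a second set of buffers, or handed to worker threads):

#include <windows.h>

const int   kPool  = 4;                 // buffers kept "in flight"
const DWORD kBlock = 64 * 1024;

void ProcessBlock(char* data, DWORD bytes);    // hypothetical per-block work

void PumpFile(const wchar_t* path)
{
    static char buf[kPool][kBlock];
    OVERLAPPED  ov[kPool] = {};
    bool        pending[kPool] = {};

    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING,
                              FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN, NULL);

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);
    LONGLONG toIssue = size.QuadPart;   // bytes not yet handed to ReadFile
    LONGLONG offset  = 0;

    // Fill the pipeline: one outstanding read per buffer.
    for (int i = 0; i < kPool && toIssue > 0; ++i)
    {
        ov[i].hEvent     = CreateEvent(NULL, TRUE, FALSE, NULL);
        ov[i].Offset     = (DWORD)offset;
        ov[i].OffsetHigh = (DWORD)(offset >> 32);
        ReadFile(file, buf[i], kBlock, NULL, &ov[i]);   // FALSE + ERROR_IO_PENDING
        pending[i] = true;
        offset += kBlock;  toIssue -= kBlock;
    }

    // Consume the buffers in order; as soon as one has been processed,
    // re-issue it at the next unread offset so the disk never sits idle.
    for (int i = 0; pending[i]; i = (i + 1) % kPool)
    {
        DWORD bytes = 0;
        GetOverlappedResult(file, &ov[i], &bytes, TRUE);    // wait for this read
        pending[i] = false;

        ProcessBlock(buf[i], bytes);    // or queue it to a worker/writer thread

        if (toIssue > 0)                // more file left: recycle the buffer
        {
            ov[i].Offset     = (DWORD)offset;
            ov[i].OffsetHigh = (DWORD)(offset >> 32);
            ReadFile(file, buf[i], kBlock, NULL, &ov[i]);
            pending[i] = true;
            offset += kBlock;  toIssue -= kBlock;
        }
    }

    for (int i = 0; i < kPool; ++i)
        if (ov[i].hEvent) CloseHandle(ov[i].hEvent);
    CloseHandle(file);
}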
-
Scott H. Settlemier wrote:
For an i/o bound processing job, which would be faster: 1) Memory map the file and process it in place 2) Use overlapped I/O, reading into one of 2 buffers, while processing the other (pipelined)

For me it would really depend on the nature of the processing and the app.

Scott H. Settlemier wrote:
On a single cpu, would I get any benefit from #2?

Other than having code that would scale well to hyperthreaded and multiprocessor hardware ... probably not. [edit] DOH! I read it as IOCP, see Peter's reply for a more appropriate answer [/edit]

Scott H. Settlemier wrote:
On a dual cpu, does #1 get no improvement over that on a single cpu?

The file is on one disk and is being read by one thread, so I would say no (not directly, though you would gain indirectly). Each method has its strengths; it really depends on the requirements.

...cmk

Save the whales - collect the whole set
-
1) A memory-mapped file can avoid the cost of copying data around (file cache <==> your buffers); this is especially helpful with large amounts of data.

2) Overlapped I/O - if supported by the hardware, but any decent box should support it by now - works directly between the disk controller and RAM, without requiring the CPU.

Which one is better strongly depends on your application and the box it runs on. Share some more info (what kind of I/O, how often, how much data, etc.). A rough sketch of #1 is below.
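Roughly like this, for the memory-mapped case (just a sketch, not compiled; ProcessBlock() is a made-up placeholder for the real work, and for files bigger than the address space you would map a sliding view with MapViewOfFile instead of the whole thing):

#include <windows.h>

void ProcessBlock(const unsigned char* data, SIZE_T bytes);   // hypothetical work

void ProcessMappedFile(const wchar_t* path)
{
    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);

    // Map the whole file read-only; the OS pages it in on demand straight
    // from the file cache, with no extra copy into user-supplied buffers.
    HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    const unsigned char* view =
        (const unsigned char*)MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);

    ProcessBlock(view, (SIZE_T)size.QuadPart);    // process in place

    UnmapViewOfFile(view);
    CloseHandle(mapping);
    CloseHandle(file);
}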
we are here to help each other get through this thing, whatever it is Vonnegut jr.
sighist Fold With Us! || Agile Programming | doxygen
-
It's a generic class for all instances of work that's I/O-bound. Specific uses are filters and hashes, which take a score or so of CPU cycles per byte of data (much less than the time it takes to read the data).

I'm allocating the buffers on the stack and using overlapped I/O, but I wasn't sure whether the operating system is actually able to read the data asynchronously or whether it just spawns another thread to do it. If it's just another thread, then on a single processor it would seem better to just use a memory-mapped file, right? No overhead for the wait operations, and the processor is fully tied up either reading (on each page fault) or processing. If overlapped I/O really is asynchronous on a single processor, then what I've got now should be optimal.

Also I wonder: if the OS has the capability for real asynchronous reads on a single processor, would it use that to help page in from a memory-mapped file? Maybe the memory-mapped file can match the overlapped I/O technique because the OS is smart (taking the sequential-access hint at open as a clue? I doubt it, but I wonder).

(Yes, the buffer sizes are calculated for optimal size -- you have to pass the working block size, the processing cost, and the wait cost (about 10K cycles, which I found from experiments on XP) for the overlapped I/O -- and it can be shown that the optimal number of rounds is independent of the read cost, assuming it's greater than the processing cost.)

Thanks!
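One way to check whether a particular ReadFile really goes asynchronous is its return value: on a handle opened with FILE_FLAG_OVERLAPPED it returns FALSE with GetLastError() == ERROR_IO_PENDING when the request has been queued, and TRUE when it was satisfied on the spot (for example straight from the file cache). A rough test along those lines ("test.dat", the 64K block and the lack of error handling are just for illustration; FILE_FLAG_NO_BUFFERING bypasses the cache so the read has to hit the disk):

#include <windows.h>
#include <stdio.h>

int main()
{
    HANDLE file = CreateFileW(L"test.dat", GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING,
                              FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING, NULL);

    // FILE_FLAG_NO_BUFFERING needs a sector-aligned buffer and transfer size;
    // VirtualAlloc returns page-aligned memory, and 64K is a safe multiple.
    void* buffer = VirtualAlloc(NULL, 64 * 1024, MEM_RESERVE | MEM_COMMIT,
                                PAGE_READWRITE);

    OVERLAPPED ov = {};
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);

    DWORD bytes = 0;
    BOOL  ok = ReadFile(file, buffer, 64 * 1024, NULL, &ov);
    if (ok)
    {
        // Completed synchronously - no opportunity to overlap this one.
        GetOverlappedResult(file, &ov, &bytes, FALSE);
        printf("synchronous completion, %lu bytes\n", bytes);
    }
    else if (GetLastError() == ERROR_IO_PENDING)
    {
        // Queued asynchronously - this is where the other buffer
        // could be processed while the disk fills this one.
        GetOverlappedResult(file, &ov, &bytes, TRUE);    // wait for it
        printf("asynchronous completion, %lu bytes\n", bytes);
    }
    else
    {
        printf("ReadFile failed, error %lu\n", GetLastError());
    }

    VirtualFree(buffer, 0, MEM_RELEASE);
    CloseHandle(ov.hEvent);
    CloseHandle(file);
    return 0;
}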