C# I/O question
-
I have to write an application that reads big files, performs some processing on their data (like an XOR on some sections of it; each section is a few megabytes), and then saves the result to a new file. What should I do in order to achieve max performance? Should I read a few sections each time, process them and append them to the new file, or one section each time? There are several threads that do this process on several different files at the same time. Sample code would be a great help. Thanks
-
poqeqw wrote:
What should I do in order to achieve max performance?
The limiting factor here is probably the drive speed. Getting the data into RAM is what is going to take the longest. Simply XORing the data is insignificant in comparison to how long it will take to load/save the data, so don't mess around with separate threads for processing the data; you'll be wasting your time. You can use
byte[] thedata = System.IO.File.ReadAllBytes(@"C:\myfile");
to get the whole file, and
System.IO.File.WriteAllBytes(@"C:\myfile", thedata);
to save the processed data. I'd start by doing it like this. Stick some timers in and time how long the load takes, how long the save takes and how long the processing takes. There is nothing you can do to speed up the load/save (apart from buying faster drives). However, if the processing time is significant, you could start the processing while the file is still loading. You could do this by doing the read asynchronously, like this:
System.IO.FileStream myStream = new System.IO.FileStream(@"C:\myfile", System.IO.FileMode.Open, System.IO.FileAccess.Read);
myStream.BeginRead(...);
and allowing the main thread to start processing while the read is still taking place. You'll have to put checks in place to make sure the main thread doesn't get ahead of the data being read. (Don't bother using threads to do several files at the same time; the drive will only thrash about and read time will actually be slower. Focus on doing them sequentially. Remember, the limiting factor here is almost certainly the drive access, unless of course the files are on separate drives or you are doing some really heavy CPU-intensive processing.)
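For what it's worth, here is a rough sketch of the "stick some timers in" idea, done one section at a time so a big file never has to fit in RAM all at once. The class and method names, the single-byte XOR key and the section size are all made up for illustration; the thread doesn't say what the real processing looks like, so treat this as a starting point rather than the way to do it.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class XorFileProcessor
{
    // Hypothetical processing step: XOR the first 'count' bytes of the buffer with one key byte.
    public static void XorInPlace(byte[] buffer, int count, byte key)
    {
        for (int i = 0; i < count; i++)
        {
            buffer[i] ^= key;
        }
    }

    // Reads inputPath one section at a time, XORs each section and appends it to outputPath.
    // Separate stopwatches accumulate the time spent reading, processing and writing.
    public static void ProcessFile(string inputPath, string outputPath, int sectionSize, byte key)
    {
        var readTimer = new Stopwatch();
        var processTimer = new Stopwatch();
        var writeTimer = new Stopwatch();
        byte[] buffer = new byte[sectionSize];

        using (var input = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            while (true)
            {
                readTimer.Start();
                int bytesRead = input.Read(buffer, 0, buffer.Length);
                readTimer.Stop();
                if (bytesRead == 0) break; // end of file

                processTimer.Start();
                XorInPlace(buffer, bytesRead, key); // only touch the bytes actually read
                processTimer.Stop();

                writeTimer.Start();
                output.Write(buffer, 0, bytesRead);
                writeTimer.Stop();
            }
        }

        Console.WriteLine("read: {0} ms, process: {1} ms, write: {2} ms",
            readTimer.ElapsedMilliseconds,
            processTimer.ElapsedMilliseconds,
            writeTimer.ElapsedMilliseconds);
    }
}
```

You would call it with something like XorFileProcessor.ProcessFile(inputPath, outputPath, 4 * 1024 * 1024, 0x5A) and compare the three numbers. Bear in mind the operating system caches reads and buffers writes, so the timings are only a rough guide; run it on files that haven't just been read. If the processing figure turns out to be tiny next to the other two, there is little point overlapping the processing with the read.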
Simon