Large file handling using MFC
-
Hi, I'm working on an application that loads a huge file (4-6GB) and parses it: searching with regular expressions, converting from hex to human-readable form, applying filters, etc. Since the file is so large, I'm facing serious performance problems, and it takes hours to process the file and show its information in a window. To get around this, I broke the file into logical segments of 1MB each and process three segments at a time; as the user scrolls, the next segments are processed. This gave a certain amount of improvement, but only for files under roughly 300MB. For files in the GB range it just crawls. Is there a better way to handle such large files so that the user doesn't have to wait for hours? Would a buffered read, with a thread caching the decoded information in temporary files, help? And how can I make sure the GUI doesn't hang during this intensive parsing? I really appreciate your suggestions. Thanks, Vikas
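For reference, the worker-thread pattern I have in mind looks roughly like this (an untested sketch; ParseContext, the parsing step, and WM_APP_SEGMENT_DONE are placeholder names, not existing APIs):

#include <afxwin.h>

#define WM_APP_SEGMENT_DONE (WM_APP + 1)   // placeholder notification message

// Hypothetical context handed to the worker thread.
struct ParseContext
{
    HWND  hNotifyWnd;      // UI window to notify when a segment is done
    DWORD segmentCount;    // number of 1MB segments in the file
    // ... file handles, decoded-data cache, etc.
};

UINT ParseThreadProc(LPVOID pParam)
{
    ParseContext* pCtx = static_cast<ParseContext*>(pParam);
    for (DWORD seg = 0; seg < pCtx->segmentCount; ++seg)
    {
        // ... regex search / hex decode / filter one segment here;
        // all the heavy lifting stays off the UI thread ...

        // PostMessage (not SendMessage) so the worker never blocks on the
        // UI thread; never touch CWnd objects directly from this thread.
        ::PostMessage(pCtx->hNotifyWnd, WM_APP_SEGMENT_DONE, seg, 0);
    }
    return 0;
}

// Started from the UI thread, e.g.:
//     AfxBeginThread(ParseThreadProc, &m_parseContext);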
-
You could memory-map the file segments. You should also get hold of a good profiler and check where the actual bottleneck is.
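Something along these lines, perhaps (an untested sketch with error handling omitted; MapSegment is just an illustrative helper name, and the 1MB figure is your existing segment size):

#include <windows.h>

// Sketch: map one segment of a huge file instead of reading it all in.
// MapViewOfFile offsets must be multiples of the allocation granularity
// (usually 64KB), so align down and step the pointer forward afterwards.
const BYTE* MapSegment(HANDLE hMap, ULONGLONG offset, SIZE_T segmentSize,
                       LPCVOID* ppViewBase /* pass to UnmapViewOfFile */)
{
    SYSTEM_INFO si;
    ::GetSystemInfo(&si);

    ULONGLONG aligned = offset - (offset % si.dwAllocationGranularity);
    SIZE_T viewSize   = segmentSize + (SIZE_T)(offset - aligned);

    const BYTE* pView = static_cast<const BYTE*>(::MapViewOfFile(
        hMap, FILE_MAP_READ,
        (DWORD)(aligned >> 32), (DWORD)(aligned & 0xFFFFFFFFu), viewSize));

    *ppViewBase = pView;
    return pView ? pView + (offset - aligned) : NULL;
}

// hMap comes from CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 0, NULL)
// on a file opened with CreateFile and GENERIC_READ.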
«_Superman_» I love work. It gives me something to do between weekends.
-
I've had pretty good performance with memory-mapped files (though I've only used files up to about 1GB). What you need to do is create a file mapping with CreateFileMapping and then use MapViewOfFile to map segments of the file. I worked with 16MB windows on the file, which actually meant I mapped 32MB at a time, 16MB before and 16MB after the nominal file offset, so that random accesses just before that offset didn't fall outside the mapped view - something like the diagram below:
+----------------+----------------+
|                |                |
|                |                |
|                |                |
+----------------+----------------+
^                ^                ^
|                |                |
-16MB        Requested          +16MB
         offset into file

I packaged this up into a MemoryMappedFile class with an associated random-access iterator class (like a vector has an iterator). The memory-mapped file could only be accessed through the iterator, which was exposed via 'begin' and 'end' methods of the class. On a reasonable workstation (2.4GHz Core 2 Duo), I could traverse the file at around 100MB/second while counting data packets of roughly 200 bytes each.
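A stripped-down sketch of the windowing idea (an illustration, not the original class; error handling omitted):

#include <windows.h>

// Keep a ~32MB read-only view centred on the offset of interest, and
// only remap when an access falls outside the current window.
class MappedWindow
{
    static const ULONGLONG HALF = 16ull * 1024 * 1024;   // 16MB each side

    HANDLE      m_hMap;
    ULONGLONG   m_fileSize;
    ULONGLONG   m_viewStart;     // file offset of m_pView
    SIZE_T      m_viewSize;
    const BYTE* m_pView;

public:
    MappedWindow(HANDLE hMap, ULONGLONG fileSize)
        : m_hMap(hMap), m_fileSize(fileSize),
          m_viewStart(0), m_viewSize(0), m_pView(NULL) {}

    ~MappedWindow() { if (m_pView) ::UnmapViewOfFile(m_pView); }

    // Return a pointer to the byte at 'offset', remapping if needed.
    const BYTE* At(ULONGLONG offset)
    {
        if (!m_pView || offset < m_viewStart ||
            offset >= m_viewStart + m_viewSize)
        {
            Remap(offset);
        }
        return m_pView + (offset - m_viewStart);
    }

private:
    void Remap(ULONGLONG offset)
    {
        if (m_pView) ::UnmapViewOfFile(m_pView);

        // Window starts 16MB before the requested offset, aligned down
        // to the allocation granularity as MapViewOfFile requires.
        SYSTEM_INFO si;
        ::GetSystemInfo(&si);
        ULONGLONG start = (offset > HALF) ? offset - HALF : 0;
        start -= start % si.dwAllocationGranularity;

        ULONGLONG end = offset + HALF;
        if (end > m_fileSize) end = m_fileSize;

        m_viewStart = start;
        m_viewSize  = (SIZE_T)(end - start);
        m_pView = static_cast<const BYTE*>(::MapViewOfFile(
            m_hMap, FILE_MAP_READ,
            (DWORD)(start >> 32), (DWORD)(start & 0xFFFFFFFFu),
            m_viewSize));
    }
};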
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p