Storing a huge volume of data for processing
-
OK, thanks to some help, I think I have what I need to extract the data I need from a HUGE (900 MB to 2 GB) text file. Now I need to store that huge volume of data so that I can process it. It's only 5 fields, but lots of entries. I will need to group the data by various fields so I can bin and plot it. I will also need to compute things like the min/max/average of one field for groups of another field. So the question is: what type of data structure do I put it into? Any suggestions?
David Wilkes
-
amatbrewer wrote:
So the question is what type of data structure do I put it into? Any suggestions?
A database. It is designed specifically to hold and process large volumes of data like this.
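As a rough illustration of the database suggestion, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `readings`, the column names, and the sample rows are all made up for the example; the real rows would come from the parsed text file.

```python
import sqlite3

# Hypothetical sample rows: (group_key, value). Real rows would come from the parser.
rows = [(1, 10.0), (1, 30.0), (2, 5.0), (2, 7.0), (2, 9.0)]

# ":memory:" gives a temporary database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (group_key INTEGER, value REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# MIN/MAX/AVG of one field, grouped by another field.
summary = conn.execute(
    "SELECT group_key, MIN(value), MAX(value), AVG(value) "
    "FROM readings GROUP BY group_key ORDER BY group_key"
).fetchall()
conn.close()

for group_key, lowest, highest, average in summary:
    print(group_key, lowest, highest, average)
```

The grouping and aggregation are left to the database engine, so changing which field you group by only means changing the query.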
-
If the data is different every time your app runs, I wouldn't recommend a database. When parsing the National Do Not Call Registry, I wrote a simple binary search to traverse the massive multi-gig file and return entries by area code. Putting it into SQL Server took 2 days. Based on the need and my knowledge of algorithms, no db was the better answer. Of course, how complicated your processing gets will determine when a db becomes a valid choice.
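The post does not show the actual code, so the following is only a sketch of the general idea: a binary search over byte offsets in a text file that is already sorted by its key. It assumes the key is the first comma-separated field, that keys sort correctly as raw bytes (for example fixed-width area codes), and that lines end with a newline. The names `key_of`, `seek_line_start`, and `find_by_key` are invented for this example.

```python
import os
import tempfile


def key_of(line):
    # Assumption: the sort key is the first comma-separated field.
    return line.split(b",", 1)[0]


def seek_line_start(f, pos):
    """Position f at the first line start at or after byte offset pos."""
    if pos == 0:
        f.seek(0)
    else:
        f.seek(pos - 1)
        f.readline()  # consume everything up to and including the next newline


def find_by_key(path, key):
    """Return every line whose key equals `key`, from a file sorted by key."""
    matches = []
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose following line has a key >= the target.
        while lo < hi:
            mid = (lo + hi) // 2
            seek_line_start(f, mid)
            line = f.readline()
            if line and key_of(line) < key:
                lo = mid + 1  # the first match starts later in the file
            else:
                hi = mid      # at or past the first match, or at end of file
        # Read forward from the first candidate line while the key still matches.
        seek_line_start(f, lo)
        for line in iter(f.readline, b""):
            if key_of(line) != key:
                break
            matches.append(line.rstrip(b"\r\n"))
    return matches


# Small demo with a made-up sorted file of "area_code,number" lines.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"206,5550001\n206,5550002\n312,5550003\n"
              b"415,5550004\n415,5550005\n415,5550006\n")
    path = tmp.name

hits = find_by_key(path, b"415")
missing = find_by_key(path, b"999")
os.remove(path)
print(hits, missing)
```

Each lookup touches only a handful of lines, so the file never has to be loaded into memory. The catch is that the file must already be sorted by the field you search on.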
-
Ennis Ray Lynch, Jr. wrote:
If the data is different every time your app runs, I wouldn't recommend a database.
Thanks!!! I had not thought of that part. The data will be 100% new each time. I basically need short-term data storage that I can build and access quickly, and that is gone once I am done. We are only talking about 6 fields (all numbers, with one in octal format), but the quantity of data is huge! I am working on creating an object that will read the file, parse out the data I need, and store it so I can perform functions on the data. This is an entirely new area for me and I am open to any ideas.
David Wilkes
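One possible shape for that kind of throwaway storage is one typed array per field, which keeps each value at 8 bytes instead of a full Python object per number. This is only a sketch: it assumes whitespace-separated records of 6 integer fields, and the position of the octal field (`OCTAL_FIELD = 0`) is a guess for illustration, as are the function names.

```python
from array import array

OCTAL_FIELD = 0  # hypothetical: index of the column that holds the octal value


def parse_line(line):
    """Split one whitespace-separated record into a list of integers."""
    parts = line.split()
    # int(text, 8) interprets the text as base 8, e.g. "017" becomes 15.
    return [int(p, 8) if i == OCTAL_FIELD else int(p) for i, p in enumerate(parts)]


def load_columns(lines, n_fields=6):
    """Store each field in its own typed array ('q' is a signed 64-bit integer)."""
    columns = [array("q") for _ in range(n_fields)]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        for column, value in zip(columns, parse_line(line)):
            column.append(value)
    return columns


# Made-up sample records; a real run would pass an open file object instead.
sample = ["017 1 2 3 4 5", "0777 6 7 8 9 10"]
cols = load_columns(sample)
print(list(cols[0]), list(cols[5]))
```

Everything lives in memory and vanishes when the program exits, which matches the "gone once I am done" requirement, as long as the parsed fields fit in RAM.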
-
Use streams and only persist the results to permanent storage. You may have to fiddle a bit if you haven't taken a good Data Structures and Algorithms class, but it can be done reasonably fast.
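A minimal sketch of the streaming idea: read the input one line at a time and keep only running totals per group, so memory use depends on the number of groups rather than the number of records. The whitespace-separated layout, the column indexes, and the `aggregate` function name are assumptions made for the example.

```python
import io


def aggregate(lines, group_index, value_index):
    """Single pass over the input, keeping running min/max/sum/count per group."""
    stats = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        group = fields[group_index]
        value = float(fields[value_index])
        entry = stats.get(group)
        if entry is None:
            stats[group] = [value, value, value, 1]  # min, max, sum, count
        else:
            entry[0] = min(entry[0], value)
            entry[1] = max(entry[1], value)
            entry[2] += value
            entry[3] += 1
    # Only these small per-group results would need to be persisted.
    return {
        group: {"min": lowest, "max": highest, "avg": total / count}
        for group, (lowest, highest, total, count) in stats.items()
    }


# Made-up input; a real run would pass open(path), which also yields lines lazily.
stream = io.StringIO("a 1 10\nb 2 4\na 3 30\nb 4 6\n")
result = aggregate(stream, group_index=0, value_index=2)
print(result)
```

Binning for a plot can work the same way, with the bin number used as the group key.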