HELP -- An alternative to the getline function
-
I'm running the code shown below, and it's just too slow. I'm reading 17k rows and 59 columns from a CSV file and it takes 3 minutes. My computer is not the fastest (5-year-old Celeron processor), but still... I got some great suggestions yesterday: one person mentioned that I should read larger chunks of data instead of one character at a time. Problem is, I don't know how. Can anybody help?

Note: rawData and record are STL vectors.

    rawData.reserve(numberRecords);
    int counter = 0;
    do {
        counter += 1;
        std::vector<std::string> record;
        record.reserve(numberVars);
        for (int i = 0; i < numberVars - 1; i++) {
            data->getline(buff, sizeof(buff), ',');
            record.push_back(buff);
        }
        std::getline(*data, value);
        record.push_back(value);
        rawData.push_back(record);
        std::vector<std::string>::iterator j;
        j = record.begin();
        record.erase(j, j + numberVars);
    } while (counter < numberRecords);

Thanks,
Hamlet
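One way to cut the number of stream calls in a loop like the one above is to read each record as a whole line and split it in memory, rather than issuing one stream getline per field. This is a sketch, assuming each record sits on its own line; the helper names (splitLine, parseCsv) are illustrative, not from the original post:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line into fields in memory.
std::vector<std::string> splitLine(const std::string& line)
{
    std::vector<std::string> record;
    std::istringstream fields(line);
    std::string value;
    while (std::getline(fields, value, ','))
        record.push_back(value);
    return record;
}

// Parse a whole CSV stream: one std::getline per record instead of
// one stream read per field.
std::vector<std::vector<std::string> > parseCsv(std::istream& data)
{
    std::vector<std::vector<std::string> > rawData;
    std::string line;
    while (std::getline(data, line))
        rawData.push_back(splitLine(line));
    return rawData;
}
```

The same loop structure as the original, but the 59 per-field stream calls per record collapse into a single line read plus in-memory splitting.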
-
Alright buddy, you can use the read() function to read a buffer into memory and then read byte by byte from there. Check whether you are about to read past the end of the buffer, and only then reread the next chunk from the file. And if you are really serious, you can try to implement this in a class like this:

    class CMyFile : public ifstream
    {
        void *m_pvBuffer;
        unsigned long m_ulBufferSize;    // may initialize this with 4096
        unsigned long m_ulSeekPosition;
    public:
        CMyFile(void);
        ~CMyFile(void);
        CMyFile& operator>>(string& param_string);
    };

Do the read() call, the interpretation of the buffer, and the various checks in the operator>> function. The above is only a general idea and by no means a complete solution.

"Do first things first, and second things not at all." — Peter Drucker.

-
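The read()-and-refill idea described in the answer above can be sketched as a small self-contained class. This is one possible interpretation under the assumptions stated there (a fixed 4096-byte buffer, refilling only when it runs dry); the names are illustrative, not from the original post:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Minimal buffered reader: pulls 4096 bytes at a time with read(),
// hands out single characters from memory, and refills only when
// the buffer is exhausted.
class BufferedReader
{
    std::istream& m_in;
    char m_buf[4096];
    std::streamsize m_count;   // bytes currently held in the buffer
    std::streamsize m_pos;     // index of the next byte to hand out
public:
    explicit BufferedReader(std::istream& in)
        : m_in(in), m_count(0), m_pos(0) {}

    // Return the next byte, or -1 at end of file.
    int get()
    {
        if (m_pos >= m_count) {            // buffer exhausted: refill
            m_in.read(m_buf, sizeof(m_buf));
            m_count = m_in.gcount();       // last chunk may be short
            m_pos = 0;
            if (m_count == 0)
                return -1;                 // nothing left in the file
        }
        return static_cast<unsigned char>(m_buf[m_pos++]);
    }

    // Read characters up to the delimiter (',' or '\n'), like getline.
    // Returns false only when the stream is already at end of file.
    bool getField(std::string& out, char delim)
    {
        out.clear();
        int c = get();
        if (c == -1)
            return false;
        while (c != -1 && c != delim) {
            out += static_cast<char>(c);
            c = get();
        }
        return true;
    }
};
```

Usage would be to open an std::ifstream, wrap it in a BufferedReader, and call getField with ',' for the first 58 fields and '\n' for the last one.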
Let's say I read a chunk with read() and then parse the chunk for commas and end-of-line separators. What happens if the chunk stops between commas or lines?

-

Hey, do I really have to spoon-feed you?

    {
        ...
        // read and parse
        ...
        if (... /* end of buffer reached && last character read != terminator */)
        {
            ...
            // re-read buffer from file
            // read and parse again
        }
    }

No more questions, please...
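The boundary case asked about above can be handled by carrying the unfinished tail of each chunk over to the next one. This is a sketch of that idea, not the poster's code; the function and variable names are illustrative:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Read a stream in fixed-size chunks, reassembling lines that are
// split across chunk boundaries: any trailing partial line is kept
// in `carry` and prepended to the next chunk.
std::vector<std::string> readLinesChunked(std::istream& in,
                                          std::size_t chunkSize)
{
    std::vector<std::string> lines;
    std::string carry;                       // unfinished line so far
    std::vector<char> buf(chunkSize);
    while (in.read(&buf[0], chunkSize) || in.gcount() > 0) {
        carry.append(&buf[0], static_cast<std::size_t>(in.gcount()));
        std::string::size_type nl;
        while ((nl = carry.find('\n')) != std::string::npos) {
            lines.push_back(carry.substr(0, nl));  // complete line
            carry.erase(0, nl + 1);
        }
        // whatever remains in `carry` ended mid-line: wait for more data
    }
    if (!carry.empty())
        lines.push_back(carry);              // file didn't end with '\n'
    return lines;
}
```

Each complete line can then be split on commas in memory; only the newline scan needs to worry about chunk boundaries.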