Increase performance of CSV processing
-
Hi, I have a CSV file that contains 1 million records with 160 columns. I have to insert (update the CSV file with) 60 more columns in the same file, so the file will end up with 220 columns. The 60 columns are filled like this: for the first row, I select the values from columns 2, 3 and 4, build a URL from them, and execute that URL to get XML data back. Then I parse that data and fill the 60 columns for that particular row. I have to do the same for row 2, row 3, and so on up to 1 million rows, and the whole file must be processed within 3 hours. How can I improve the performance of reading and writing the CSV file? Regards, sjs
-
Since we know nothing of your code, it's pretty much impossible to tell you how to improve the performance.
A guide to posting questions on CodeProject[^]
Dave Kreskowiak -
Here on CP, Sebastien Lorion's very popular 2011 article and code, "A Fast CSV Reader," immediately comes to mind: [^]. I think you'll find a good strategy for optimizing access to your file in that article. But, given "I have to select values from column 2,3,4 and generate the url and execute the url so that I will get xml data": does this mean you are writing XML into your CSV file? ... edit ... You might also examine the open-source file library FileHelpers: [^]. I have not used this library.
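For illustration only, here is a minimal sketch of streaming the file row by row and appending the extra columns, using plain StreamReader/StreamWriter rather than either library. The file names and the BuildExtraColumns helper are hypothetical, and the naive Split only works if fields never contain embedded commas or quotes (otherwise use a real CSV parser like the one from the article):

using System;
using System.IO;

class CsvAppendColumns
{
    static void Main()
    {
        // Hypothetical file names - substitute your own paths.
        const string inputPath = "input.csv";
        const string outputPath = "output.csv";

        // Stream the file line by line so the whole 1-million-row file
        // never has to sit in memory at once.
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath, false))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // NOTE: naive split - fine only if fields never contain
                // embedded commas or quotes.
                string[] fields = line.Split(',');

                // Hypothetical helper: builds the 60 extra values for this row
                // from columns 2, 3 and 4 (the URL call and XML parsing go here).
                string[] extra = BuildExtraColumns(fields[1], fields[2], fields[3]);

                writer.WriteLine(line + "," + string.Join(",", extra));
            }
        }
    }

    static string[] BuildExtraColumns(string a, string b, string c)
    {
        // Placeholder: call the URL, parse the XML, return 60 values.
        return new string[60];
    }
}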
"If you seek to aid everyone that suffers in the galaxy, you will only weaken yourself … and weaken them. It is the internal struggles, when fought and won on their own, that yield the strongest rewards… If you care for others, then dispense with pity and sacrifice and recognize the value in letting them fight their own battles." Darth Traya
-
1 million records isn't so much. The real win is that you can spawn multiple threads, each starting at a different point in the file and writing to a different output file, then have another step combine the outputs. But if the file is only a few GB, I would just load the entire thing into RAM in one quick blit and get it over with.
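As a rough sketch of that idea (untested, with a placeholder ProcessRow stub standing in for the URL call and XML parsing): load the rows, let each worker process a contiguous slice into its own part file, then concatenate the parts in order.

using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

class ParallelCsvSketch
{
    // Hypothetical per-row worker - the URL call and XML parsing go here.
    static string ProcessRow(string row) => row /* + ",extra1,extra2,..." */;

    static void Main()
    {
        string[] rows = File.ReadAllLines("input.csv");   // one quick blit into RAM
        int workers = Environment.ProcessorCount;
        int chunk = (rows.Length + workers - 1) / workers;

        // Each worker handles a contiguous slice and writes its own part file.
        Parallel.For(0, workers, w =>
        {
            var slice = rows.Skip(w * chunk).Take(chunk).Select(ProcessRow);
            File.WriteAllLines($"part_{w}.csv", slice);
        });

        // Combine the part files in order afterwards.
        using (var output = new StreamWriter("output.csv"))
            for (int w = 0; w < workers; w++)
                foreach (var line in File.ReadLines($"part_{w}.csv"))
                    output.WriteLine(line);
    }
}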
Need custom software developed? I do custom programming based primarily on MS tools with an emphasis on C# development and consulting. "And they, since they Were not the one dead, turned to their affairs" -- Robert Frost "All users always want Excel" --Ennis Lynch
-
You already posted this question two days ago: http://www.codeproject.com/Messages/4723449/Get-and-Update-the-CSV-File-cells.aspx[^] Please don't repost the same question.
"These people looked deep within my soul and assigned me a number based on the order in which I joined." - Homer
-
sjs4u wrote:
I have to process this file within 3 hrs.
Sorry, but that's not how it works. You can't take an arbitrary process and demand that it be done within a certain time-frame :) CSV files aren't meant to be read or manipulated "fast"; use a database if speed is important.
sjs4u wrote:
How can I improve the performance of reading and writing the CSV file.
Divide the workload over multiple PCs - then again, we don't know if that's even possible; it depends on the structure of the file.
Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^]
-
I suspect the problem is not reading/writing the CSV but the manipulation you have to do on each record. Try breaking the operation into blocks: read the file, store it into a database, process the records and update the database, then write out the results. Then identify the slowest operation and work on that.
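A minimal sketch of the "identify the slowest operation" step, assuming three hypothetical stage methods (ReadCsv, Enrich, WriteCsv) that you would replace with your real code:

using System;
using System.Diagnostics;

class StageTiming
{
    // Hypothetical stage stubs - replace with the real read / enrich / write code.
    static string[] ReadCsv(string path) => new string[0];
    static string[] Enrich(string[] rows) => rows;
    static void WriteCsv(string path, string[] rows) { }

    static void Main()
    {
        var sw = Stopwatch.StartNew();
        var rows = ReadCsv("input.csv");
        Console.WriteLine("Read:   " + sw.Elapsed);

        sw.Restart();
        var enriched = Enrich(rows);
        Console.WriteLine("Enrich: " + sw.Elapsed);

        sw.Restart();
        WriteCsv("output.csv", enriched);
        Console.WriteLine("Write:  " + sw.Elapsed);
    }
}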
Never underestimate the power of human stupidity RAH
-
sjs4u wrote:
1 million records and ...and execute the url so
That is your bottleneck. If you need to make 1 million entirely different requests, that is going to be a problem. If in fact many of the requests are duplicates, then you can cache the result the first time and use the cached data after that. Beyond that, the URL requests can be put onto threads and, within reason, you can have a number of them in flight at once before proceeding.
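A rough sketch of both ideas combined: identical URLs are fetched only once and repeats reuse the cached result, while a semaphore limits how many requests are in flight at a time. The URL list, the example URLs, and the limit of 20 concurrent requests are assumptions you would tune for your own data.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class UrlCacheSketch
{
    static readonly HttpClient Http = new HttpClient();

    // Cache of URL -> in-flight or completed request, so duplicates are fetched once.
    static readonly ConcurrentDictionary<string, Task<string>> Cache =
        new ConcurrentDictionary<string, Task<string>>();

    // Limit the number of simultaneous requests (20 is an arbitrary example value).
    static readonly SemaphoreSlim Throttle = new SemaphoreSlim(20);

    static Task<string> GetXmlAsync(string url) =>
        Cache.GetOrAdd(url, async u =>
        {
            await Throttle.WaitAsync();
            try { return await Http.GetStringAsync(u); }
            finally { Throttle.Release(); }
        });

    static async Task Main()
    {
        // Hypothetical URLs built from columns 2, 3 and 4 of each row.
        var urls = new List<string> { "http://example.com/a", "http://example.com/a" };

        string[] xml = await Task.WhenAll(urls.Select(GetXmlAsync));
        Console.WriteLine(xml.Length + " responses (duplicates served from the cache)");
    }
}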