Parsing Tab Delimited File
-
Is there any way faster than using Split()? The lines all have exactly the same format, with no nulls, etc. This is what I am using:
while (html_sr.Peek() >= 0)
{
    txt_file = html_sr.ReadLine();
    user_info = txt_file.Split('\t');
    SQLAction.InsertUpdateUser(int.Parse(user_info[5]), user_info[2], int.Parse(user_info[4]), double.Parse(user_info[3]), date_modified, int.Parse(user_info[1]), int.Parse(user_info[0]));
}
-
The first obvious thing is:
while (html_sr.Peek() >= 0)
{
    txt_file = html_sr.ReadLine();
    // ...
}
Why not use:
while ((txt_file = html_sr.ReadLine()) != null)
{
    // Do stuff...
}
Now you no longer have to perform the Peek operation, and your test is against a value you would get anyway. ReadLine already has to determine whether it is at the end of the file (it returns null there), so why duplicate the effort?
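For anyone who wants to try the loop in isolation, here is a minimal sketch. The class and method names (`TabFileReader`, `ReadRows`) are made up for illustration and are not from the original code; it takes any `TextReader`, so a `StreamReader` over the real file or a `StringReader` over test data both work.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TabFileReader
{
    // Hypothetical helper: reads lines until ReadLine() returns null
    // (end of input) and splits each one on tabs. No Peek() call is needed.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(line.Split('\t'));
        }
        return rows;
    }
}
```

Something like `TabFileReader.ReadRows(new StringReader("1\t2\tbob"))` should give back one row with three fields. Collecting the rows in a list is only for demonstration; in the real loop you would call the insert method on each row directly.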
-
Thanks! I had thought that ReadLine would throw an exception :) Now to test and see whether this gives any performance improvement, though I suppose most of my problem is the SQL method. Right now it processes on average 170 of those lines per second; I'm just trying to squash the time down as much as possible.
-
You should probably use a profiler to see what takes the most time. It is probably the SQL query. Make sure you're not opening the connection for each query; open it once and run all the queries on it.
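If no profiler is at hand, a rough split between parsing time and database time can be had with two `Stopwatch` instances. This is only a sketch: `StageTimer` and `Measure` are invented names, and the `insert` delegate is a stand-in for the real call (for example `SQLAction.InsertUpdateUser`), not the actual method.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class StageTimer
{
    // Hypothetical helper: accumulates the time spent reading and splitting
    // lines separately from the time spent in the insert callback.
    public static (TimeSpan Parse, TimeSpan Insert) Measure(
        TextReader reader, Action<string[]> insert)
    {
        var parse = new Stopwatch();
        var db = new Stopwatch();
        while (true)
        {
            parse.Start();                       // resumes, keeps the running total
            string line = reader.ReadLine();
            string[] fields = line == null ? null : line.Split('\t');
            parse.Stop();
            if (fields == null) break;           // end of input

            db.Start();
            insert(fields);                      // stand-in for the database call
            db.Stop();
        }
        return (parse.Elapsed, db.Elapsed);
    }
}
```

If the second number dwarfs the first, as the reply above suspects, then replacing Split() will not change the overall throughput much.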
Thanks... the query is actually a single stored procedure, but a complex one.
-
If you're doing INSERTs on a SQL Server from a comma-delimited file, use BCP (Bulk Copy). It can do up to 100,000 lines/second on a big machine, and on desktop machines it processes up to 10,000 lines/second.
-
Unfortunately, the text file needs to be "processed".
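Processing and bulk loading need not exclude each other. One possible approach, sketched below under assumptions, is to do the processing in C# and stage the results in a `DataTable`. The column names are hypothetical (the thread only shows field positions and types), `UserStaging` is an invented name, and parsing with the invariant culture is an assumption about the file's number format.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;

public static class UserStaging
{
    // Hypothetical staging helper. Column order mirrors the argument order
    // of the InsertUpdateUser call in the question; the names are invented.
    public static DataTable Load(TextReader reader, DateTime dateModified)
    {
        var table = new DataTable("Users");
        table.Columns.Add("Field5", typeof(int));
        table.Columns.Add("Field2", typeof(string));
        table.Columns.Add("Field4", typeof(int));
        table.Columns.Add("Field3", typeof(double));
        table.Columns.Add("DateModified", typeof(DateTime));
        table.Columns.Add("Field1", typeof(int));
        table.Columns.Add("Field0", typeof(int));

        var inv = CultureInfo.InvariantCulture;  // assumption about the file format
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] f = line.Split('\t');
            // Any per-row "processing" would happen here, before staging.
            table.Rows.Add(
                int.Parse(f[5], inv), f[2], int.Parse(f[4], inv),
                double.Parse(f[3], inv), dateModified,
                int.Parse(f[1], inv), int.Parse(f[0], inv));
        }
        return table;
    }
}
```

On .NET versions that ship `SqlBulkCopy`, the staged table could then be passed to `SqlBulkCopy.WriteToServer(DataTable)` in one go. Whether that fits depends on what the stored procedure does; if it updates existing rows, loading into a staging table and running the procedure's logic set-based afterwards might be needed.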