BULK data import
-
Hello, I have a task coming up shortly that involves loading several BILLION rows of data from a text file into SQL Server 2000. I'm wondering what the fastest way to do this is, e.g. SQL BULK INSERT, or coding it up as a C# executable, etc. Does anyone have any tips or tricks?
-
From one text file? Is it at least CSV or fixed-width? The SQL Server bulk copy utility is bcp, but I wouldn't use it for such a large import; I'd use a program (written in C#, but that's just me). I would have the program commit and checkpoint its position in the file periodically (every 10,000 rows?) so it can restart near where it left off after a failure. With bcp you'd probably have to start all over. My database library routines can also write a log entry every so many inserts or seconds, so I can watch the log, confirm the import is actually still progressing, and get a feel for how long it's taking. With bcp you're flying blind. I do occasionally use bcp for small imports, and it usually involves importing into a sort of "raw" staging table, from which I can then copy the data to its proper places, especially if data from each row has to be put into several tables.
-
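The commit-and-checkpoint pattern described above could be sketched roughly as follows. This is only an illustration, not the poster's actual library code: the connection string, file names, table, and column names are all hypothetical, and error handling is omitted.

```csharp
using System;
using System.Data.SqlClient;
using System.IO;

class CheckpointedImport
{
    const int BatchSize = 10000;

    static void Main()
    {
        // Hypothetical connection string, input file, and checkpoint file.
        string connStr = "Server=myserver;Database=mydb;Integrated Security=SSPI;";
        string checkpointFile = "import.checkpoint";

        // Resume from the last committed line number, if a checkpoint exists.
        long startLine = File.Exists(checkpointFile)
            ? long.Parse(File.ReadAllText(checkpointFile))
            : 0;

        using (StreamReader reader = new StreamReader("data.txt"))
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            long lineNo = 0;
            SqlTransaction tx = conn.BeginTransaction();

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lineNo++;
                if (lineNo <= startLine) continue;   // skip rows already loaded

                string[] fields = line.Split(',');
                using (SqlCommand cmd = new SqlCommand(
                    "INSERT INTO dbo.RawImport (Id, Value) VALUES (@id, @val)",
                    conn, tx))
                {
                    cmd.Parameters.AddWithValue("@id", int.Parse(fields[0]));
                    cmd.Parameters.AddWithValue("@val", fields[1]);
                    cmd.ExecuteNonQuery();
                }

                // Commit and record our position every BatchSize rows, so a
                // failed run can restart near where it left off. Also serves
                // as the "still progressing" log entry mentioned above.
                if (lineNo % BatchSize == 0)
                {
                    tx.Commit();
                    File.WriteAllText(checkpointFile, lineNo.ToString());
                    Console.WriteLine("Committed through line " + lineNo);
                    tx = conn.BeginTransaction();
                }
            }

            tx.Commit();
            File.WriteAllText(checkpointFile, lineNo.ToString());
        }
    }
}
```

The checkpoint file is written only after the transaction commits, so on restart the program never skips rows that were rolled back.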
Thanks for the info. The data is in several files. I'll write some C# code to analyse it and make sure there are no errors before the load. I tried a test on that yesterday with a hundred million rows and it only took a couple of minutes; the slow point is the SQL import. Committing at intervals is a good idea. I'm testing a 1-million-row import at the moment (C# and a stored procedure) and getting about 7 minutes. That's only five days of processing for a billion records... I might try the SqlBulkCopy class next and see whether that's any faster. Thanks

-- modified at 1:40 Friday 4th May, 2007

SqlBulkCopy - 14 seconds!
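For reference, a minimal SqlBulkCopy sketch along these lines (server, file, table, and column names are hypothetical; SqlBulkCopy is a .NET 2.0 class but can target SQL Server 2000). It streams the file in chunks so the DataTable never holds millions of rows in memory at once:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class BulkLoader
{
    static void Main()
    {
        // Hypothetical connection string -- adjust for your server.
        string connStr = "Server=myserver;Database=mydb;Integrated Security=SSPI;";

        // DataTable columns must match the destination table's schema.
        DataTable table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Value", typeof(string));

        using (StreamReader reader = new StreamReader("data.txt"))
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.RawImport";
                bulk.BatchSize = 10000;     // commit every 10,000 rows
                bulk.BulkCopyTimeout = 0;   // no timeout on a long load

                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    string[] fields = line.Split(',');
                    table.Rows.Add(int.Parse(fields[0]), fields[1]);

                    // Flush periodically to keep memory use bounded.
                    if (table.Rows.Count == 10000)
                    {
                        bulk.WriteToServer(table);
                        table.Clear();
                    }
                }
                if (table.Rows.Count > 0)
                    bulk.WriteToServer(table);
            }
        }
    }
}
```

SqlBulkCopy uses the same bulk-load path as bcp internally, which is why it is so much faster than row-by-row INSERTs through a stored procedure.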