Fastest way to pull data out of MS Access DB
-
Currently i'm pulling out data by using this code:
odCommand = New OleDbCommand("select top 5000 * from TPers, TComp where tpers.compkey = tcomp.compkey", odConnection)
odReader = odCommand.ExecuteReader
While odReader.Read
InsertDBIndexData(odReader("persid").ToString, odReader("cpr").ToString, odReader("firstname").ToString & " " & odReader("lastname").ToString, odReader("legalname").ToString, odReader("tpers.address1").ToString)
End While
odReader.Close()this takes 5 minutes and 5000 people is indexed in my LuceneDB. But the AccessDB contains more than 100.000 people, so the faster the better. I also tried to index the whole DB, but with this code it's never finished. Don't know why. What's the fastest and best way to pull data from an AccessDB?
-
Currently i'm pulling out data by using this code:
odCommand = New OleDbCommand("select top 5000 * from TPers, TComp where tpers.compkey = tcomp.compkey", odConnection)
odReader = odCommand.ExecuteReader
While odReader.Read
InsertDBIndexData(odReader("persid").ToString, odReader("cpr").ToString, odReader("firstname").ToString & " " & odReader("lastname").ToString, odReader("legalname").ToString, odReader("tpers.address1").ToString)
End While
odReader.Close()this takes 5 minutes and 5000 people is indexed in my LuceneDB. But the AccessDB contains more than 100.000 people, so the faster the better. I also tried to index the whole DB, but with this code it's never finished. Don't know why. What's the fastest and best way to pull data from an AccessDB?
I doubt Access is as fast as SQL Server, perhaps you've outgrown access ?
Christian Graus - Microsoft MVP - C++ "also I don't think "TranslateOneToTwoBillion OneHundredAndFortySevenMillion FourHundredAndEightyThreeThousand SixHundredAndFortySeven()" is a very good choice for a function name" - SpacixOne ( offering help to someone who really needed it ) ( spaces added for the benefit of people running at < 1280x1024 )
-
I doubt Access is as fast as SQL Server, perhaps you've outgrown access ?
Christian Graus - Microsoft MVP - C++ "also I don't think "TranslateOneToTwoBillion OneHundredAndFortySevenMillion FourHundredAndEightyThreeThousand SixHundredAndFortySeven()" is a very good choice for a function name" - SpacixOne ( offering help to someone who really needed it ) ( spaces added for the benefit of people running at < 1280x1024 )
I'm afraid i have no choice but to stick with Access. But you're certainly right though. There's no way to get that data any faster than using the method i've already posted? Or just a better method that wont crash the program if it's running for too long.
-
I'm afraid i have no choice but to stick with Access. But you're certainly right though. There's no way to get that data any faster than using the method i've already posted? Or just a better method that wont crash the program if it's running for too long.
Not really. Why are you using the TOP modifier in the SQL? You're not doing any ordering, so it looks like you're just using it to limit the results returned, which could be ANY 5000 of the record set returned constrained by the WHERE clause. The biggest part of the problem may not be the records returned, but by the code running in
InsertDBIndexData
. If it's adding data to a sorted collection, the data will be added slower and slower as the size of the collection increases.A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007 -
Not really. Why are you using the TOP modifier in the SQL? You're not doing any ordering, so it looks like you're just using it to limit the results returned, which could be ANY 5000 of the record set returned constrained by the WHERE clause. The biggest part of the problem may not be the records returned, but by the code running in
InsertDBIndexData
. If it's adding data to a sorted collection, the data will be added slower and slower as the size of the collection increases.A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007Thanks for your reply. I'm not sorting anything, only adding it to a Lucene.Net indexing database. This is how the code looks:
Private Sub InsertDBIndexData(ByVal idbID As Integer, ByVal sCPR As String, ByVal sName As String, ByVal sCompany As String, ByVal sAddress As String, ByVal sStatus As String, ByVal sType As String)
'New person in the index
Dim writer As New Lucene.Net.Index.IndexWriter(DBIndexLocation, New StandardAnalyzer, False)
Dim doc As New Document
Dim f1 As New Field("dbID", idbID, Field.Store.YES, Field.Index.NO)
Dim f2 As New Field("CPR", sCPR, Field.Store.YES, Field.Index.TOKENIZED)
Dim f3 As New Field("Name", sName, Field.Store.YES, Field.Index.TOKENIZED)
Dim f4 As New Field("Company", sCompany, Field.Store.YES, Field.Index.TOKENIZED)
Dim f5 As New Field("Address", sAddress, Field.Store.YES, Field.Index.TOKENIZED)
Dim f6 As New Field("Status", sStatus, Field.Store.YES, Field.Index.NO)
Dim f7 As New Field("Type", sType, Field.Store.YES, Field.Index.NO)
doc.Add(f1)
doc.Add(f2)
doc.Add(f3)
doc.Add(f4)
doc.Add(f5)
doc.Add(f6)
doc.Add(f7)
writer.AddDocument(doc)
writer.Optimize()
writer.Close()
End SubPrivate Sub newIndex()
Dim odConnection As oleDbConnection = new oleDbConnection("......db.mdb")
odConnection.Open()
Dim writer As New Lucene.Net.Index.IndexWriter(DBIndexLocation, New StandardAnalyzer, True)
writer.Close()odCommand = New OleDbCommand("select \* from Customers", odConnection) odReader = odCommand.ExecuteReader While odReader.Read 'InsertDBIndexData(odReader("id"), odReader("cpr"), odReader("firstname") & " " & odReader("lastname"), odReader("legalname")) InsertDBIndexData(odReader("persid").ToString, odReader("cpr").ToString, odReader("firstname").ToString & " " & odReader("lastname").ToString, odReader("legalname").ToString, odReader("address1").ToString, odReader("Status").ToString, odReader("InsType").ToString) End While odReader.Close() End Sub
The only optimizing i can figure out here, is to only open the (Lucene.Net)writer once. And when i'm indexing it also takes up ALOT of resources. Is there any way i can limit that? Would it be better to first drag everything out of the DB, place it in an ArrayList and when that's done, start moving it to the Lucene
-
Thanks for your reply. I'm not sorting anything, only adding it to a Lucene.Net indexing database. This is how the code looks:
Private Sub InsertDBIndexData(ByVal idbID As Integer, ByVal sCPR As String, ByVal sName As String, ByVal sCompany As String, ByVal sAddress As String, ByVal sStatus As String, ByVal sType As String)
'New person in the index
Dim writer As New Lucene.Net.Index.IndexWriter(DBIndexLocation, New StandardAnalyzer, False)
Dim doc As New Document
Dim f1 As New Field("dbID", idbID, Field.Store.YES, Field.Index.NO)
Dim f2 As New Field("CPR", sCPR, Field.Store.YES, Field.Index.TOKENIZED)
Dim f3 As New Field("Name", sName, Field.Store.YES, Field.Index.TOKENIZED)
Dim f4 As New Field("Company", sCompany, Field.Store.YES, Field.Index.TOKENIZED)
Dim f5 As New Field("Address", sAddress, Field.Store.YES, Field.Index.TOKENIZED)
Dim f6 As New Field("Status", sStatus, Field.Store.YES, Field.Index.NO)
Dim f7 As New Field("Type", sType, Field.Store.YES, Field.Index.NO)
doc.Add(f1)
doc.Add(f2)
doc.Add(f3)
doc.Add(f4)
doc.Add(f5)
doc.Add(f6)
doc.Add(f7)
writer.AddDocument(doc)
writer.Optimize()
writer.Close()
End SubPrivate Sub newIndex()
Dim odConnection As oleDbConnection = new oleDbConnection("......db.mdb")
odConnection.Open()
Dim writer As New Lucene.Net.Index.IndexWriter(DBIndexLocation, New StandardAnalyzer, True)
writer.Close()odCommand = New OleDbCommand("select \* from Customers", odConnection) odReader = odCommand.ExecuteReader While odReader.Read 'InsertDBIndexData(odReader("id"), odReader("cpr"), odReader("firstname") & " " & odReader("lastname"), odReader("legalname")) InsertDBIndexData(odReader("persid").ToString, odReader("cpr").ToString, odReader("firstname").ToString & " " & odReader("lastname").ToString, odReader("legalname").ToString, odReader("address1").ToString, odReader("Status").ToString, odReader("InsType").ToString) End While odReader.Close() End Sub
The only optimizing i can figure out here, is to only open the (Lucene.Net)writer once. And when i'm indexing it also takes up ALOT of resources. Is there any way i can limit that? Would it be better to first drag everything out of the DB, place it in an ArrayList and when that's done, start moving it to the Lucene
Jan Sommer wrote:
writer.AddDocument(doc) writer.Optimize()
I get the feeling that this is where your code spends most of it's time. Just on these two lines. You can use the ANTS profiler, or CompuWare's, to find out how much time it takes to execute these instructions.
Jan Sommer wrote:
And when i'm indexing it also takes up ALOT of resources. Is there any way i can limit that?
No, there isn't.
A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007 -
Jan Sommer wrote:
writer.AddDocument(doc) writer.Optimize()
I get the feeling that this is where your code spends most of it's time. Just on these two lines. You can use the ANTS profiler, or CompuWare's, to find out how much time it takes to execute these instructions.
Jan Sommer wrote:
And when i'm indexing it also takes up ALOT of resources. Is there any way i can limit that?
No, there isn't.
A guide to posting questions on CodeProject[^]
Dave Kreskowiak Microsoft MVP Visual Developer - Visual Basic
2006, 2007The writer.optimize() part certainly took alot of time. I removed it and placed it after the while odreader.read, and that saved me 1/3 of the time it took to index the db. But it still isn't fast enough. Currently i'm indexing the DB and when i check the taskmanager it says something like: Process: IndexService.exe, CPU: 00 (sometimes 01) and memory: between 100.000 and 800.000 Why is it taking up all the memory and not using the CPU? The indexing program is soon going to run on a server with a bit more ram. I hope that will speed up things a little. Will it be any faster if i somehow convinced the IT-department to switch to MSSQL?