re: Fast Searching...
-
Hi,

Here is the problem: I need to search a data file that contains 26 million addresses and find a given address. The data file is an ASCII text file and is fixed.

What I have come up with:

1. Sort the 26 million addresses into postcode order. (I have an application for this.)
2. Using fseek, jump to the middle of the file and use strncmp to compare the postcode I am searching for against the postcode in the data file.
3. Depending on the result, move the file pointer backwards or forwards by half the remaining range and keep checking until a match is found.

This loops roughly 24-25 times (log2 of 26 million is about 24.6) instead of up to 26 million times.

Is there a better way?
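The three steps above can be sketched in C as follows. This is only a minimal illustration under assumptions the post does not state: every record is taken to be exactly `REC_LEN` bytes (newline included) and to begin with a space-padded postcode of `POSTCODE_LEN` bytes. Both constants and the name `find_record` are hypothetical, and the file must have been sorted in the same byte order that `strncmp` uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout (not given in the post): fixed-width records that
   start with a space-padded postcode. Adjust both values to the real file. */
#define REC_LEN      16   /* bytes per record, including the newline */
#define POSTCODE_LEN 8    /* bytes of postcode at the start of a record */

/* Binary search over a file sorted by postcode.
   Returns the index of a matching record, or -1 if none is found
   (or on an I/O error). On success the record is copied into `out`,
   which must hold at least REC_LEN + 1 bytes. */
long find_record(FILE *fp, const char *postcode, long nrecords, char *out)
{
    char key[POSTCODE_LEN + 1];
    char rec[REC_LEN];
    long lo = 0, hi = nrecords - 1;

    assert(strlen(postcode) <= POSTCODE_LEN);

    /* Pad the search key with spaces to match the on-disk field. */
    memset(key, ' ', POSTCODE_LEN);
    memcpy(key, postcode, strlen(postcode));
    key[POSTCODE_LEN] = '\0';

    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;

        /* fseek takes a long offset; for files larger than 2 GB a
           64-bit variant such as fseeko or _fseeki64 may be needed. */
        if (fseek(fp, mid * REC_LEN, SEEK_SET) != 0)
            return -1;
        if (fread(rec, 1, REC_LEN, fp) != REC_LEN)
            return -1;

        int cmp = strncmp(key, rec, POSTCODE_LEN);
        if (cmp == 0) {
            memcpy(out, rec, REC_LEN);
            out[REC_LEN] = '\0';
            return mid;
        }
        if (cmp < 0)
            hi = mid - 1;   /* key sorts before the middle record */
        else
            lo = mid + 1;   /* key sorts after the middle record */
    }
    return -1;
}
```

If several addresses share a postcode, this returns whichever matching record the search happens to land on; scanning backwards and forwards from that index would collect the rest.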
-
I would recommend converting the text file into a database with a tree-based index for faster searching. I can think of two libraries you could possibly use to accomplish this:

QDBM http://qdbm.sourceforge.net/
SQLITE http://www.sqlite.org/

Best Wishes,
-David Delaune
-
Probably the easiest way to speed up searching is to divide the big file into N smaller files, perhaps organized by a partial postcode. Then, once you have selected the correct (smaller) file, the search will be that much faster. This involves a minimum of presort setup.
Best wishes, Hans
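
A rough sketch of that split in C, assuming each line of the big file begins with its postcode and that the first `PREFIX_LEN` characters are a sensible partial postcode to group by. The `addr_XX.txt` naming scheme, `PREFIX_LEN`, and both function names are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define PREFIX_LEN 2   /* assumed: group by the first two characters */

/* Builds a bucket file name such as "addr_AB.txt" from a postcode.
   Returns 0 on success, -1 if the postcode is too short or the
   buffer is too small. */
int bucket_filename(const char *postcode, char *buf, size_t buflen)
{
    char prefix[PREFIX_LEN + 1];

    assert(postcode != NULL && buf != NULL);
    if (strlen(postcode) < PREFIX_LEN)
        return -1;

    for (int i = 0; i < PREFIX_LEN; i++) {
        unsigned char c = (unsigned char)postcode[i];
        /* Keep the file name safe: anything non-alphanumeric becomes '_'. */
        prefix[i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    prefix[PREFIX_LEN] = '\0';

    int n = snprintf(buf, buflen, "addr_%s.txt", prefix);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* One-time setup: appends every line of `in` to its bucket file in the
   current directory. Returns the number of lines written, or -1 if a
   bucket file cannot be opened. If the input is already sorted by
   postcode, each bucket file is opened only once. */
long split_by_prefix(FILE *in)
{
    char line[512], name[32], current[32] = "";
    FILE *out = NULL;
    long count = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (bucket_filename(line, name, sizeof name) != 0)
            continue;                      /* skip lines that are too short */
        if (out == NULL || strcmp(name, current) != 0) {
            if (out != NULL)
                fclose(out);
            out = fopen(name, "a");
            if (out == NULL)
                return -1;
            strcpy(current, name);
        }
        fputs(line, out);
        count++;
    }
    if (out != NULL)
        fclose(out);
    return count;
}
```

At lookup time the same `bucket_filename` call picks the file to open, and the search then runs over that much smaller file only.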