Fast keyword Search through CSortedArray?? anyone?
-
Hi There, I have a virtual listview that I load up anywhere from 20,000 to 200,000 records into. The virtual listview handles this no problem at all. I am using the CSortedArray by PJ Naughter class to store a array of custom objects. I can sort the array very fast no problems. My issue occurs when I have to do a keyword search. I have to be able to allow the user to search as they type into a text box. So the user starts off with a FULL list of ALL the items appearing in the virtual listview, as the type keys into the textbox the virtual listview starts drilling down and only showing items that have the letters being typed in the name. so simple example: List ==== Greg John Steve George If the user types G in the text box the list will now look like below List ==== Greg George I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU. Is there some faster way I can do search like this using the CSortedArry? Thanks Greg
-
Hi There, I have a virtual listview that I load up anywhere from 20,000 to 200,000 records into. The virtual listview handles this no problem at all. I am using the CSortedArray by PJ Naughter class to store a array of custom objects. I can sort the array very fast no problems. My issue occurs when I have to do a keyword search. I have to be able to allow the user to search as they type into a text box. So the user starts off with a FULL list of ALL the items appearing in the virtual listview, as the type keys into the textbox the virtual listview starts drilling down and only showing items that have the letters being typed in the name. so simple example: List ==== Greg John Steve George If the user types G in the text box the list will now look like below List ==== Greg George I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU. Is there some faster way I can do search like this using the CSortedArry? Thanks Greg
Hi You can use a Thread to process a binary search. The when search finish you can select or show the item(s) David
-
Hi You can use a Thread to process a binary search. The when search finish you can select or show the item(s) David
ah ok thank you! Do you know if there is any examples of where I can find how to do this? Thanks, Greg
-
Hi There, I have a virtual listview that I load up anywhere from 20,000 to 200,000 records into. The virtual listview handles this no problem at all. I am using the CSortedArray by PJ Naughter class to store a array of custom objects. I can sort the array very fast no problems. My issue occurs when I have to do a keyword search. I have to be able to allow the user to search as they type into a text box. So the user starts off with a FULL list of ALL the items appearing in the virtual listview, as the type keys into the textbox the virtual listview starts drilling down and only showing items that have the letters being typed in the name. so simple example: List ==== Greg John Steve George If the user types G in the text box the list will now look like below List ==== Greg George I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU. Is there some faster way I can do search like this using the CSortedArry? Thanks Greg
Greg Ellis wrote:
I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU.
With that many items, you are going to eat up a lot of CPU time anyway. A map or hash table would allow for more efficient searching, but with an array (even a sorted one), you are still going to be stuck with an O(n) operation. Additionally, if you have 30,000 items in the list, and your search will only find 1 entry, the list control has to do a LOT of work to remove 29,999 items. So don't be surprised to see a CPU/Memory spike for a second or two.
If you decide to become a software engineer, you are signing up to have a 1/2" piece of silicon tell you exactly how stupid you really are for 8 hours a day, 5 days a week Zac
-
Hi There, I have a virtual listview that I load up anywhere from 20,000 to 200,000 records into. The virtual listview handles this no problem at all. I am using the CSortedArray by PJ Naughter class to store a array of custom objects. I can sort the array very fast no problems. My issue occurs when I have to do a keyword search. I have to be able to allow the user to search as they type into a text box. So the user starts off with a FULL list of ALL the items appearing in the virtual listview, as the type keys into the textbox the virtual listview starts drilling down and only showing items that have the letters being typed in the name. so simple example: List ==== Greg John Steve George If the user types G in the text box the list will now look like below List ==== Greg George I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU. Is there some faster way I can do search like this using the CSortedArry? Thanks Greg
Instead of using a single array to hold all the data, why not break it up into say 27 arrays (one for each letter of the alphabet and numbers). You would have to create an algorithm for adding items to the correct array and choosing the correct array depending on the first character entered.
-
Hi There, I have a virtual listview that I load up anywhere from 20,000 to 200,000 records into. The virtual listview handles this no problem at all. I am using the CSortedArray by PJ Naughter class to store a array of custom objects. I can sort the array very fast no problems. My issue occurs when I have to do a keyword search. I have to be able to allow the user to search as they type into a text box. So the user starts off with a FULL list of ALL the items appearing in the virtual listview, as the type keys into the textbox the virtual listview starts drilling down and only showing items that have the letters being typed in the name. so simple example: List ==== Greg John Steve George If the user types G in the text box the list will now look like below List ==== Greg George I have this working using a simple for loop to loop through all the items at each keystroke and remove any items from the list that do not match the keys typed. However, when my list gets to 30,000 or higher the search begins to take up WAY too much CPU. Is there some faster way I can do search like this using the CSortedArry? Thanks Greg
Try something like this. It uses binary searches so it will be many, many times faster then using linear searches. It’s a simple console app which reads in words from a file (each word in the file separated by any white space). It is not limited to only matching a single charcter; you can match all words which start with "act" for example. ------------------------ #include #include #include #include #include #include #include struct PrefixMatchLess : std::binary_function { bool operator()(const std::string &s, const std::string &prefix) const { const std::string sub = s.substr(0, prefix.length()); return sub { bool operator()(const std::string &prefix, const std::string &s) const { const std::string sub = s.substr(0, prefix.length()); return sub>prefix; } }; int main() { // For notational convenience. using namespace std; typedef vector StringList_t; typedef istream_iterator StringInIt_t; typedef ostream_iterator StringOutIt_t; // Open the word list. ifstream ifs("C:\\Words.txt"); if (!ifs) { cerr << "Failed to open word list!" << endl; return 1; } // List for words. StringList_t words; // Read in the word list. copy(StringInIt_t(ifs), StringInIt_t(), back_inserter(words)); // Sort the list and remove duplicates. sort(words.begin(), words.end()); words.erase(unique(words.begin(), words.end()), words.end()); // Output the word list (diagnostic). cout << "Word list:" << endl; copy(words.begin(), words.end(), StringOutIt_t(cout, "\n")); cout << endl; for(;;) { // Get the prefix to search for. string word; cout << "Enter search prefix (! quit): "; cin >> word; // Check if the user wants to quit. if (word=="!") { break; // Quit. } cout << endl; // Calculate range. StringList_t::iterator b = lower_bound(words.begin(), words.end(), word, PrefixMatchLess()); StringList_t::iterator e = upper_bound(b, words.end(), word, PrefixMatchGreater()); // Output results. cout << "Matching words:" << endl; copy(b, e, StringOutIt_t(cout, "\n")); cout << endl; } return 0; } Steve
-
ah ok thank you! Do you know if there is any examples of where I can find how to do this? Thanks, Greg
If I understand correctly, you have a list of ordered elements, and a partial match. Binary search will help, but won't be exactly what you want. To do something similar, I use a binary search which does a length restricted string comparison to find an element which matches, and then backs up to find the first match. For my application this approach is fine, since I only have a few items in each such grouping. Your mileage may vary. However, a better approach, which someone has already suggested, might be to have a small array of indices into the array, keyed by first letter. Essentially something like this: typedef struct index { TCHAR chr; int start; } INDEX; CArray<INDEX, INDEX&>> ix; INDEX in; ix.RemoveAll(); in.chr = _T('\0'); in.start = -1; for(i = 0; i < nElements; i++) { // get first char of this item... chr = ....; if (in.chr!=chr) { // make a new index entry in.chr = chr; in.start = i; ix.Add(in); } } in.chr = _T('\0'); in.start = nElements; ix.Add(in); To do your search, look in the ix array for a matching first character, get the start index from the matching entry, and the stop index from the next entry :) This gives you a smaller space to do additional matching if necessary.
Steve S Developer for hire