SQL Query For String Searching
-
My case is Input is "abdknmgbm". BUT The record in database that i want to query out is "abkdnmgmb" The input and database record is quite similar. String length is same, only some arrangement of character are not same. In the above case, dk and kd ; bm and mb. How i write query for this string searching? Help and Thanks.
-
My case is Input is "abdknmgbm". BUT The record in database that i want to query out is "abkdnmgmb" The input and database record is quite similar. String length is same, only some arrangement of character are not same. In the above case, dk and kd ; bm and mb. How i write query for this string searching? Help and Thanks.
:snork: A while back I wrote a Levenshtein Distance function (CLR) for SQL Server. :-D Such could be used and tested for "less than some threshold". The Levenshtein Distance between your samples is 4. http://en.wikipedia.org/wiki/Levenshtein_distance[^]
-
:snork: A while back I wrote a Levenshtein Distance function (CLR) for SQL Server. :-D Such could be used and tested for "less than some threshold". The Levenshtein Distance between your samples is 4. http://en.wikipedia.org/wiki/Levenshtein_distance[^]
Actually the swap char will not always are char 3 and 4 and last 2 chars. For example it may also be like this : Input: " abcdkazbmopdkubmgt " The wanted result is : "abckdazmbopkdumbgt" Input and wanted result length are same. BUT the input string length is not always the same. Very hard to solve this. :( Tq.
-
Actually the swap char will not always are char 3 and 4 and last 2 chars. For example it may also be like this : Input: " abcdkazbmopdkubmgt " The wanted result is : "abckdazmbopkdumbgt" Input and wanted result length are same. BUT the input string length is not always the same. Very hard to solve this. :( Tq.
So? Those have a Levenshtein Distance of 10 (8 if you Trim) -- maybe that's within your threshdold, I dunno.
-
My case is Input is "abdknmgbm". BUT The record in database that i want to query out is "abkdnmgmb" The input and database record is quite similar. String length is same, only some arrangement of character are not same. In the above case, dk and kd ; bm and mb. How i write query for this string searching? Help and Thanks.
-
So? Those have a Levenshtein Distance of 10 (8 if you Trim) -- maybe that's within your threshdold, I dunno.
I have read about Levenshtein Distance. New knowledge for me. Thx. But how can i apply into my coding? I am using Asp.net with C#. I have no idea about this.
-
I have read about Levenshtein Distance. New knowledge for me. Thx. But how can i apply into my coding? I am using Asp.net with C#. I have no idea about this.
If you think it will do what you need you can write a function in C# or VB and add it to SQL Server (I assume you're using SQL Server). As to how to add it to SQL Server, I'd rather write a tip than post it here. Given that the function has the signature
int Levenshtein ( string , string )
you would then be able to say:SELECT * FROM sometable WHERE dbo.Levenshtein ( somefield , @somevalue ) < @somethreshold
The threshold should probably be based on the lengths of the strings, for instance half the length of the shorter string. To make things more efficient, you could write a function with the signaturebool IsSimilar ( string , string , int )
that will return false as soon as the Levenshtein Distance between the strings exceeds the threshold (the third parameter).SELECT * FROM sometable WHERE dbo.IsSimilar ( somefield , @somevalue , @somethreshold ) = 1
-
If you are using SQL server try SOUNDEX. Do not know if it will work though.
Cannot work. Because they are not in english word.
-
lhsunshine wrote:
How i write query for this string searching
Very inefficiently. If you have a lot of data then you should probably look for another way to implement the search.
Inefficient is better than nothing or wrong. :-D
-
Inefficient is better than nothing or wrong. :-D