Remove a noise word in a SQL Server 2008 database
-
Hello all. I created a DB with a single table, with two columns enabled for full-text search. Everything works fine. However, whenever I search for one particular word ("salud", which means "health" in Spanish), it returns no results:
select id, vision, age, gender, [state], tier1
from visions
where contains(tier1, 'Salud')
Frustrated with this, I started looking into the causes. I came across an article that suggested running the word through the parser, and so I did:
SELECT * FROM sys.dm_fts_parser (' "Salud" ', 3082, 0, 0)
This returned the following record:
0x00730061006C00750064 1 0 1 Noise Word salud 0 Salud
So, to my dismay, "salud" is a Noise Word. I've been searching for how to stop "salud" from being a noise word, but all I find are references to some noise word files, which I'm unable to find in my SQL Server 2008 installation. Thus my question: does anybody know how I can remove "salud" from the noise words? Any clue on this matter will be really appreciated! Thanks in advance. Best regards.
-
The "stop words" are in a system view.
SELECT [stopword], [language_id]
FROM [master].[sys].[fulltext_system_stopwords]
where language_id=3082
I tried a DELETE query:
delete
FROM [master].[sys].[fulltext_system_stopwords]
where [stopword]='salud' and language_id=3082
but that caused an exception: "Ad hoc updates to system catalogs are not allowed". I have no further ideas...
-
Eddy Vluggen supplied an important hint, but it applies to a different version. Look at http://msdn.microsoft.com/en-us/library/cc280871%28v=sql.100%29.aspx
-
Hi, thanks so much for your help! I finally managed to remove the noise word. It seems that SQL Server 2008 R2 has the noise words for each language hardcoded somewhere (apparently in some "resource" database, whatever that is). So, what I did was:
1.- Create my own stoplist.
2.- In SQL Server Management Studio, open the full-text index properties and select the newly created stoplist rather than the default.
3.- Alter the stoplist and drop the noise word.
4.- Re-index the full-text index.
Thanks again for your help! :)
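For anyone who prefers a script to Management Studio, the four steps above can be sketched in T-SQL roughly as follows. This is only a sketch under a few assumptions: the stoplist name VisionsStoplist is made up for illustration, the full-text index is the one on the visions table from the question, and the database runs at compatibility level 100 (which stoplists need, as far as I know). The steps are reordered slightly so that the word is dropped before the stoplist is attached to the index.

```sql
-- Hypothetical stoplist name; pick whatever fits your naming scheme.
-- 1. Create a custom stoplist as a copy of the system stoplist.
CREATE FULLTEXT STOPLIST VisionsStoplist FROM SYSTEM STOPLIST;
GO

-- 2. Drop "salud" from the copy, for Spanish only (LCID 3082).
ALTER FULLTEXT STOPLIST VisionsStoplist DROP 'salud' LANGUAGE 3082;
GO

-- 3. Point the full-text index on visions at the custom stoplist.
--    Unless WITH NO POPULATION is added, changing the stoplist
--    should start a repopulation of the index on its own.
ALTER FULLTEXT INDEX ON visions SET STOPLIST VisionsStoplist;
GO

-- 4. Only needed if the stoplist was already attached before the
--    word was dropped (the order used in the post above):
-- ALTER FULLTEXT INDEX ON visions START FULL POPULATION;

-- Check: parse the word against the custom stoplist instead of the
-- system one (third argument 0 = system stoplist).
DECLARE @stoplist_id int;
SELECT @stoplist_id = stoplist_id
FROM sys.fulltext_stoplists
WHERE name = N'VisionsStoplist';

SELECT special_term, display_term
FROM sys.dm_fts_parser(N'"Salud"', 3082, @stoplist_id, 0);
```

If everything went through, the last query should no longer report "Noise Word" in special_term for salud, and the original CONTAINS query should start returning rows once the population has finished.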
-
Yeah, I tried looking for those files, yet didn't find 'em on SQL Server 2008. I found the thesaurus, mind you, but not the noise words. I read on MSDN that in 2008 they are stored in the "Resource" database, yet I wasn't able to find 'em. Thanks for the insight!
-
Ah, yes, that did it! Found 'em yesterday, very late at night, when I was cursing the DB builders. :laugh: Thanks, cheers!
-
Why the heck "salud" (health) is a noise word escapes my comprehension, though.