Full Text Search: Noise words are being searched for

I have a database in SQL Server 2008 with Full Text Search indexes. I have defined the stopword 'al' in the stoplist. However, when I search for any phrase containing the keyword 'al', the word 'al' is still used in ranking.

This might be related to the fact that I break up the search terms and reconstruct them, then search across multiple fields and rank the results: http://pastebin.com/fdce11ff. The code breaks up a search such as

'al hamra'

into

("*al*" ~ "*hamra*") OR ("*al*" OR "*hamra*") 

for the Full Text Search.

Imagine this scenario:

Name: Al Hamra, Author: Jack Brown, Genre: Fiction
Name: Al Karawan, Author: Al Hanz, Genre: Romance

Now a search for 'al hamra' also returns 'Al Karawan', even though 'al' is in the stoplist. Why is this? I thought words in a stoplist would lose their weight in ranking.


Noise words are specific to a language (identified by its LCID); have you added yours under the right one? You can use sys.dm_fts_parser to test it (below). This might also work better than your manual word breaking in code (or not).

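-- Arguments: query string, LCID, stoplist_id, accent sensitivity.
-- A stoplist_id of 0 uses the system stoplist; pass your own stoplist's ID to test a custom one.
SELECT special_term, display_term
FROM sys.dm_fts_parser
  (' "al hamra" ', 1033, 0, 0)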

This assumes you are using LCID 1033 (US English). If your noise word is registered under the language you expect, it should show up in the output with a special_term of 'Noise Word'.
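
To check a custom stoplist rather than the system one, something along these lines might help. It is only a sketch: the stoplist name 'MyStoplist' is a placeholder for whatever yours is called, and 1033 is assumed to be the language your index uses. The first query shows which stoplist and language the 'al' entry was actually registered under; the second parses the phrase against that stoplist.

```sql
-- Which stoplist and language is 'al' registered under?
SELECT sl.stoplist_id, sl.name, sw.stopword, sw.language_id
FROM sys.fulltext_stoplists AS sl
JOIN sys.fulltext_stopwords AS sw
  ON sw.stoplist_id = sl.stoplist_id
WHERE sw.stopword = N'al';

-- 'MyStoplist' is a hypothetical name; substitute your own stoplist.
DECLARE @stoplist_id int =
  (SELECT stoplist_id FROM sys.fulltext_stoplists WHERE name = N'MyStoplist');

-- Parse the phrase using the custom stoplist instead of the system one.
SELECT special_term, display_term
FROM sys.dm_fts_parser(N'"al hamra"', 1033, @stoplist_id, 0);
```

If the language_id returned by the first query is not the one your full-text index uses for those columns, the stopword will not be applied to them.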
