开发者

SQL Server Full text Search dilemma

I have a table (Catalog) with one column (Title). This column stores information about a song (artist, title, remix). I have a situation where I need to find matches from a search term.

I enabled SQL Server FTS and created an FTS catalog using the Title column. I started out testing using FREETEXTTABLE where I pass in the search term.

I found that this returns way to many non-relevant results although the top ranked results usually are right IF the title exists in the Catalog table. One approach I had was to convert the RANK to percent and only display results where the percent is greater than 90. The problem is that the query still returns non-relevant results IF the title does not exist in the Catalog table.

The alternative is to use CONTAINSTABLE. The problem here is that I would have to dynamically generate the query in code (break down the words) creating something like:

SELECT DISTINCT ft.[rank], 开发者_如何学C[Id]
FROM CONTAINSTABLE(Catalogs, Title, '"artist" AND "title" AND "remix"') AS ft
JOIN [Catalogs] ON [Catalogs].[Id] = ft.[KEY]

The dilemma is that there are no separators in the search term. This means I can't logically break up the string into artist and title but rather just words. Sometimes other terms are included like label which will cause the above query to fail.

I am a little bit stuck. Has anyone dealt with a similar problem using SQL Server FTS? Is there an approach in between CONTAINSTABLE and FREETEXTTABLE?


We had a similar problem where the users were allowed to enter a free query search string, but we had to use CONTAINS as FREETEXT was returning too many false positives. We ended up writing our own search-term parsing routine on the business layer that scrubs the string and replaces any spaces with AND. This of course has to be clever enough to accommodate logical grouping (when people use brackets) and multiple spaces. It seems to work well for us.

I am a bit curious about the structure of the data in your column. If artist, title and remix are really independent bits of information, would it not make sense to maintain them as distinct columns and query them individually?

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜