How do I do a full-text search in SQL Server 2008 where the data contains multiple languages?
I have a database table in SQL Server 2008 R2 which contains data stored in multiple languages, including English, Swedish, Hungarian and German.
The table uses the Latin1_General_CI_AS collation. The full text catalog has the table assigned to it with an index on the multi-language column.
I have two problems:
- In the catalog properties, a language has to be specified for word breaks. This is currently set to English. How do I get it to use multiple languages for word breaks?
- Hungarian is not even available in the list of languages that can be selected for word breaks. How do I configure the full text search to search Hungarian text?
Each row in the table contains only a single language.
According to Microsoft's documentation for the sys.fulltext_languages catalog view, Hungarian is not a supported language for Full-Text Search.
The full list of supported languages is at http://msdn.microsoft.com/en-us/library/ms176076.aspx
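To check what your own instance actually supports, rather than relying on the documentation page alone, you can query the catalog view directly. This is a minimal sketch; the `LIKE` filter in the second query is only an illustration of how to look for one language by name.

```sql
-- List every language the full-text engine on this instance has
-- registered, together with its locale identifier (LCID).
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Look for one language by name; an empty result means the instance
-- has no word breaker registered under that name.
SELECT lcid, name
FROM sys.fulltext_languages
WHERE name LIKE N'%Hungarian%';
```

The LCID values returned here are the ones accepted by the LANGUAGE option when defining a full-text index column or writing a full-text query.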
It also appears that you are going to have to choose one language or another:
http://blogs.msdn.com/b/sqlcat/archive/2008/11/06/best-practices-for-integrated-full-text-search-ifts-in-sql-2008.aspx
Handling multiple languages in a single document is a hard problem. Which word breaker do you use to shred the original document, and which language are you going to specify for the query? For example, if you have a document with Korean and English and you use the Korean word breaker to process the document, then if you search the document for English words it will only find the exact words and not any other forms of the words (like ing and s).
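The two choices that the quote refers to, the language used to break the text at index time and the language used at query time, can be sketched in T-SQL as follows. The names dbo.Documents, DocumentId, Body, PK_Documents and DocCatalog are hypothetical stand-ins for your own table, key column, text column, unique key index and catalog. Treat this as an illustration of the syntax, not a recommended configuration.

```sql
-- Index time: the column is processed with a single word breaker.
-- 1033 is the LCID for English. If the table already has a full-text
-- index, this statement fails and ALTER FULLTEXT INDEX would be
-- needed instead.
CREATE FULLTEXT INDEX ON dbo.Documents (Body LANGUAGE 1033)
    KEY INDEX PK_Documents
    ON DocCatalog;

-- Query time: the language is chosen separately for each query.
-- Here the search term is stemmed using English rules.
SELECT DocumentId, Body
FROM dbo.Documents
WHERE CONTAINS(Body, N'FORMSOF(INFLECTIONAL, "search")', LANGUAGE 1033);

-- The same column queried with Swedish rules (LCID 1053). Because the
-- text was broken with the English word breaker, inflected Swedish
-- forms may not match as expected.
SELECT DocumentId, Body
FROM dbo.Documents
WHERE CONTAINS(Body, N'FORMSOF(INFLECTIONAL, "bil")', LANGUAGE 1053);
```

Whether a query language that differs from the index language gives useful results depends on the word breakers involved, which is the trade-off the quoted passage describes.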