
Sphinx: dash in author names causing problems when searching

I've read all the posts about dashes and tried pretty much everything mentioned in them, yet cannot figure out a strange problem I'm having.

For example, I have an author name like this:

Arturo Pérez-Reverte

A search for 'pérez-reverte' will not turn up anything, nor will 'pérez\-reverte', so escaping the dash is not the issue. A search for 'spider-man' does return hits, which suggests that dashes themselves are handled correctly. A search for 'perez reverte' also finds a hit, because it searches each word separately and finds the 'reverte' in 'perez-reverte' (but doesn't seem to find the 'perez').

A search for either 'pérez' or 'perez' finds the same number of documents, suggesting that the accent is not an issue (I do have a charset_table which accounts for accented characters).

So I'm very confused as to what's happening here. If it isn't the accent and it isn't the dash, what could it be?

I don't have any ignore_chars set, I'm using UTF-8, and I have a charset_table that treats accented characters as regular characters.

The only difference between these two terms is that one of them is a title (spider-man) and the other an author, but they are both part of the same Sphinx index declaration, so I don't see that as an issue in any way.

Any help would be greatly appreciated.


After much fighting with it, I found out that even though my database is all UTF-8 with the proper collation, I needed to add this to sphinx.conf for everything to work properly:

sql_query_pre = SET NAMES utf8
sql_query_pre = SET CHARACTER SET utf8 

After doing that, and having the proper charset_table, everything seems to be working fine.
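For anyone unsure where those lines go, here is a rough sketch of how they might sit in sphinx.conf. The source and index names, connection settings, query, and path are all placeholders I made up for illustration, and the charset_table is a heavily abbreviated example rather than the one I actually use. The sql_query_pre directives belong in the source block (they run before the main sql_query), while charset_table belongs in the index block.

```
# Hypothetical names and credentials; adjust to your own setup.
source books_src
{
    type          = mysql
    sql_host      = localhost
    sql_user      = sphinx_user
    sql_pass      = secret
    sql_db        = library

    # Run before the main query so the connection itself uses UTF-8.
    sql_query_pre = SET NAMES utf8
    sql_query_pre = SET CHARACTER SET utf8

    # Illustrative query; the real table and columns will differ.
    sql_query     = SELECT id, title, author FROM books
}

index books
{
    source        = books_src
    path          = /var/lib/sphinx/books

    # Abbreviated example: fold case and map accented E/e to plain e.
    charset_table = 0..9, A..Z->a..z, _, a..z, U+C9->e, U+E9->e
}
```

The index has to be rebuilt with indexer after changing these settings, since they only affect how documents are fetched and tokenized at indexing time.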

Hope this helps someone else.
