Full text search for irregular rapper names with Solr

I'm implementing full text search functionality on my rap website, and I'm running into some issues with rapper and song names.

For example, someone might want to search for the rapper "Cam'ron" using the query "camron" (leaving out the mid-word apostrophe). Likewise, someone might search for the song "3 Peat" using the query "3peat".

"The Notorious B.I.G." is a bit of a weird case: "The Notorious BIG" and "The Notorious B.I.G." both work (I guess because the solr.StandardFilterFactory removes dots from acronyms?), but "The Notorious B.I.G" (i.e., minus the trailing dot) doesn't.

Ideally all reasonable variations of these names should work. I'm guessing the answer has something to do with the solr.WordDelimiterFilterFactory, but I'm not sure.

Also, I'm using Sunspot with Rails if that's relevant.


Yes, you are right: you need to configure WordDelimiterFilterFactory properly. Try enabling all of its options, and don't forget preserveOriginal, which keeps the original term alongside the generated ones.

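generateWordParts - emits each alphabetic part as its own term: B.I.G. becomes B, I, G

generateNumberParts - emits each numeric part as its own term: 3Peat becomes 3 and Peat (the split at the digit/letter boundary relies on splitOnNumerics, which is on by default)

catenateWords - joins adjacent word parts: B.I.G. becomes BIG, and Cam'ron becomes Camron

catenateNumbers - joins adjacent number parts: 802.11 becomes 80211

catenateAll - joins all parts of a term: Rapper-802.11 becomes Rapper80211

splitOnCaseChange - splits at lowercase-to-uppercase transitions: GanGsTa becomes Gan, Gs, Ta

preserveOriginal - also keeps the original term unchanged: Rapper-802.11RuuLlZ is still indexed as Rapper-802.11RuuLlZ, in addition to its parts.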
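
As a rough sketch, the options above could be enabled in schema.xml like this. The field type name text_names is made up for illustration, and the choice of WhitespaceTokenizerFactory is an assumption on my part: the idea is that a tokenizer which keeps punctuation inside the token lets the dots and apostrophes reach the word delimiter filter instead of being stripped first. Adapt it to whatever field type your Sunspot schema actually uses.

```xml
<!-- Hypothetical field type; the name "text_names" is illustrative only. -->
<fieldType name="text_names" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Assumption: split on whitespace only, so "B.I.G." and "Cam'ron"
         arrive at the word delimiter filter with their punctuation intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <!-- Lowercase after splitting, so splitOnCaseChange still sees the original case. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After changing the schema you would need to restart Solr and reindex before the new analysis applies to existing documents. The Solr admin analysis page is a convenient way to check which terms a name like "The Notorious B.I.G" actually produces.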
