Full text search for irregular rapper names with Solr
I'm implementing full text search functionality on my rap website, and I'm running into some issues with rapper and song names.
For example, someone might want to search for the rapper "Cam'ron" using the query "camron" (leaving out the mid-word apostrophe). Likewise, someone might search for the song "3 Peat" using the query "3peat".
"The Notorious B.I.G." is a bit of a weird case: "The Notorious BIG" and "The Notorious B.I.G." both work (I guess because the solr.StandardFilterFactory removes dots from acronyms?), but "The Notorious B.I.G" (i.e., minus the trailing dot) doesn't.
Ideally all reasonable variations of these names should work. I'm guessing the answer has something to do with the solr.WordDelimiterFilterFactory, but I'm not sure.
Also, I'm using Sunspot with Rails if that's relevant.
Yes, you are right: you need to configure WordDelimiterFilterFactory properly. Try enabling all of its options, and don't forget preserveOriginal, which also keeps the original term.
generateWordParts - splits B.I.G. into the terms B, I, G
generateNumberParts - emits the numeric parts of a term, so 3Peat yields 3 alongside Peat (the split at the digit/letter boundary itself comes from splitOnNumerics, which is on by default)
catenateWords - joins adjacent word parts, so B.I.G. also yields BIG
catenateNumbers - joins adjacent number parts, so Rapper 802.11 yields Rapper 80211
catenateAll - joins all parts of a term, so Rapper-802.11 yields Rapper80211
splitOnCaseChange - splits on lowercase-to-uppercase transitions, so GanGsTa yields Gan, Gs, Ta
preserveOriginal - keeps the original term in addition to its parts, so Rapper-802.11RuuLlZ is also emitted unchanged as Rapper-802.11RuuLlZ
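Put together, the options above might look something like this in schema.xml. This is only a sketch under a few assumptions: the field type name text_names is made up for illustration, the whitespace tokenizer is chosen so that punctuation such as apostrophes and dots reaches the filter intact, and the catenate options are turned off at query time (as Solr's example schema does) because the catenated forms are already in the index.

```xml
<!-- Hypothetical field type; the name "text_names" is an assumption. -->
<fieldType name="text_names" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Split on whitespace only, so Cam'ron and B.I.G. reach the filter unchanged -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <!-- Lowercase after splitting, so splitOnCaseChange still sees the original case -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Same splitting rules, but no catenation at query time -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a setup along these lines, Cam'ron should be indexed as cam, ron, camron and cam'ron, so the query camron can match, and the query 3peat should be split into 3 and peat, which match the separately indexed tokens of 3 Peat. After changing the analyzers you will need to reindex. With Sunspot, the change would presumably go into the schema.xml of the Solr instance it manages, most likely by adjusting the field type your text fields already use.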