开发者

Return stemmed word in Solr

We have stemming in our Solr search and we need to retrieve the word/phrase after stemming. That is if I search for "oranges", through stemming a search for "orange开发者_Go百科" is carried out. If I turn on debugQuery I would be able to see this, however we'd like to access it through the result if possible. Basically, we need this, because we pass the searched word as a parameter to a 3rd party application which highlights the word in an online PDF reader. Currently, if a user searches for "oranges" and a document contains "orange", then the PDF wouldn't highlight anything since it tries to highlight "oranges" not "orange".

Thanks all in advance,

Krt_Malta


I've no experience with Solr but if you need it just for presentation to users you could stem their queries using the same stemmer Solr uses yourself. This would probably be faster since it would avoid a trip to Solr's index. For English this would presumably be http://tartarus.org/~martin/PorterStemmer/ - or you could check Solr's implementation.

However, a word of caution, most stemming algorithms do not guarantee that stemmed words will be actual words. Check here http://snowball.tartarus.org/algorithms/english/stemmer.html for examples.


You could use the implicit analysis request handler to get the stemmed word.

For your example, if you are using the text_en field and the Snowball Stemmer, the URL

<YOUR SOLR HOST>/solr/<YOUR COLLECTION>/analysis/field?analysis.query=oranges&analysis.fieldtype=text_en&verbose_output=1

would give you a json response, including the following:

"org.apache.lucene.analysis.snowball.SnowballFilter",
      [
        {
          "text": "orang",
             ...
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜