Java Lucene English Stemmer? [closed]

I need help indexing and searching English text using Java Lucene on Google App Engine. The only solution I have found so far is the SnowballAnalyzer (in the contrib packages), but the version I found only supports Lucene 3.0, and GAELucene only supports Lucene 2.3.1. Simply swapping the JARs doesn't work.

Can anyone help me index my text with an English stemmer?


The SnowballAnalyzer has been with Lucene for a long time now, including 2.x versions (see its entry in the 2.4.1 API docs).

Bizarrely, though, it doesn't ship as part of the standard Lucene distribution, even though it appears in the documentation. You'll have to hunt down a build of the contrib package made for 2.3.1.

Edit: Looks like there's a copy here.


The PorterStemFilter is in the Lucene core, so no contrib JAR is needed. It can be combined with the StandardAnalyzer's tokenizer and filters for English stemming.
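
A minimal sketch of that idea, assuming the Lucene 2.3.x core JAR is on the classpath. The class name `EnglishStemAnalyzer` and the field name `"body"` are made up for illustration; the chain mirrors what StandardAnalyzer does and appends PorterStemFilter at the end. Whether GAELucene lets you plug in a custom analyzer is not something this sketch verifies.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical analyzer name; written against the Lucene 2.3.x API.
public class EnglishStemAnalyzer extends Analyzer {

    public TokenStream tokenStream(String fieldName, Reader reader) {
        // Same tokenizer and filters StandardAnalyzer uses.
        TokenStream stream = new StandardTokenizer(reader);
        stream = new StandardFilter(stream);
        // PorterStemFilter expects lower-cased input, so lower-case first.
        stream = new LowerCaseFilter(stream);
        stream = new StopFilter(stream, StopAnalyzer.ENGLISH_STOP_WORDS);
        // Reduce each remaining term to its Porter stem.
        return new PorterStemFilter(stream);
    }

    // Small demo: print the stemmed terms for a sample sentence.
    public static void main(String[] args) throws IOException {
        TokenStream ts = new EnglishStemAnalyzer()
                .tokenStream("body", new StringReader("Indexing and searching English text"));
        for (Token t = ts.next(); t != null; t = ts.next()) {
            System.out.println(t.termText());
        }
    }
}
```

Pass the same analyzer to both the IndexWriter and the QueryParser, so that query terms are stemmed the same way as the indexed terms.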


Various companies also sell more sophisticated and/or speedier alternatives to Porter Stemmers implemented in a Snowball interpreter. If you have needs in that direction, post a comment and I'll elaborate, but I don't want to get accused of unjustified advertising, so I'll leave it there for now.


You can use lucene-2.3.1.zip or its neighboring files in the Lucene archive. I am unsure, however, how much customization GAELucene allows. It does not appear to accept arbitrary analyzers.
