Java Lucene English Stemmer? [closed]
Closed 2 years ago.
I need help indexing and searching English text using Java Lucene on Google App Engine. The only solution I have found so far is the SnowballAnalyzer (in the contrib packages), but the version I found only supports Lucene 3.0, and GAELucene only supports Lucene 2.3.1. Just swapping the jars doesn't really work.
Can anyone help me index my text with an English stemmer?
The SnowballAnalyzer has been with Lucene for a long time now, including 2.x versions (see its entry in the 2.4.1 API docs).
Bizarrely, though, it doesn't come as part of the standard Lucene distribution, even though it is in the documentation. You'll have to hunt down a version of the contrib package built for 2.3.1.
Edit: Looks like there's a copy here.
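The PorterStemFilter is in the Lucene core. It can be used with the StandardAnalyzer for English stemming. Note that the filter expects its input to be lowercased already, so a LowerCaseFilter should come before it in the chain.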
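To make the shape of such a chain concrete, here is a small self-contained sketch that does not depend on Lucene at all. It tokenizes, lowercases, and then stems, which is the same order a tokenizer, LowerCaseFilter and PorterStemFilter chain would use. The class and method names are hypothetical, and the stemming step implements only step 1a of the Porter algorithm (plural endings), so treat it as an illustration rather than a replacement for the real filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
final class StemChainSketch {

    // Step 1a of the Porter algorithm only:
    // SSES -> SS, IES -> I, SS -> SS, S -> "" (first matching rule wins).
    static String stemStep1a(String word) {
        if (word.endsWith("sses")) return word.substring(0, word.length() - 2);
        if (word.endsWith("ies")) return word.substring(0, word.length() - 2);
        if (word.endsWith("ss")) return word;
        if (word.endsWith("s")) return word.substring(0, word.length() - 1);
        return word;
    }

    // Split on non-letters, lowercase each token, then stem it.
    // Lowercasing happens before stemming because the suffix rules
    // above only match lowercase endings.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<String>();
        for (String token : text.split("[^A-Za-z]+")) {
            if (token.isEmpty()) continue; // leading delimiter yields an empty token
            terms.add(stemStep1a(token.toLowerCase(Locale.ENGLISH)));
        }
        return terms;
    }
}
```

Calling `StemChainSketch.analyze(...)` on both the indexed text and the query text is what makes "ponies" and "Ponies" land on the same term. With Lucene the same rule applies: use the same stemming analyzer at index time and at search time.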
Various companies also sell more sophisticated and/or speedier alternatives to Porter Stemmers implemented in a Snowball interpreter. If you have needs in that direction, post a comment and I'll elaborate, but I don't want to get accused of unjustified advertising, so I'll leave it there for now.
You can use lucene-2.3.1.zip or its neighboring files in the Lucene archive. I am unsure, however, how much customization GAELucene allows. It does not appear to accept arbitrary analyzers.