Lucene Analyzer for Indexing and Searching
I have a field that I am indexing with Lucene like so:
@Field(name="hungerState", index=Index.TOKENIZED, store=Store.YES)
public HungerState getHungerState() {
The possible values of this field are HUNGRY, SLIGHTLY_HUNGRY, and NOT_HUNGRY
When these values are indexed using the StandardAnalyzer, the terms end up as hungry and slightly, since it tokenizes on punctuation and drops "not" as a stop word.
If I change the index to index=Index.UN_TOKENIZED, the indexed terms are HUNGRY, SLIGHTLY_HUNGRY, and NOT_HUNGRY, as expected.
My search API has one "search" method that constructs the Query like so:
MultiFieldQueryParser parser = new MultiFieldQueryParser(Version.LUCENE_30, getSearchFields(), new StandardAnalyzer(Version.LUCENE_30));
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query query = parser.parse(searchTerms);
This handles searches where searchTerms = "foo", which searches all fields returned by getSearchFields() for "foo", and also searches where searchTerms specifies fields and values (e.g. "hungerState:HUNGRY").
My problem is with the latter scenario. Since the query parser is using a StandardAnalyzer, searches for hungerState:SLIGHTLY_HUNGRY get parsed into hungerState:"slightly hungry" and searches for hungerState:NOT_HUNGRY get parsed into hungerState:hungry.
When the field is indexed using the StandardAnalyzer, I get unexpected results (searches for HUNGRY and NOT_HUNGRY return results for all 3 values). When the field is indexed as UN_TOKENIZED, I don't get any results since the query parser tokenizes the search string and makes it lowercase.
I've even tried specifying an Analyzer for indexing, like KeywordAnalyzer, but it has pretty much no effect, since the entire search string is analyzed with StandardAnalyzer every time.
Any advice would be appreciated. Thanks!
You're using a standard analyzer for your query parser, so yes, your query will be analyzed with a standard analyzer. Just switch to using a keyword analyzer (note that KeywordAnalyzer has a no-argument constructor and does not take a Version):
MultiFieldQueryParser parser = new MultiFieldQueryParser(Version.LUCENE_30, getSearchFields(),
        new KeywordAnalyzer());
You may want to use a PerFieldAnalyzerWrapper if your other fields aren't keywords.
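To make the per-field idea concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It only imitates the behavior described in the question: a "standard-like" analysis that splits on punctuation, lowercases, and drops "not", a "keyword-like" analysis that keeps the whole value as one term, and a lookup that picks the analysis by field name. The class name, the helper methods, the one-word stop list, and the "description" field are all hypothetical stand-ins. In Lucene 3.0 the real wrapper is constructed with a default analyzer and given per-field overrides through addAnalyzer(fieldName, analyzer).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class PerFieldAnalysisSketch {
    // Tiny stand-in for a stop-word list (assumption: the real list is longer).
    private static final Set<String> STOP_WORDS = Set.of("not");

    // Rough imitation of the standard analysis described in the question:
    // split on non-alphanumeric characters, lowercase, drop stop words.
    static List<String> standardLike(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^A-Za-z0-9]+")) {
            String term = piece.toLowerCase(Locale.ROOT);
            if (!term.isEmpty() && !STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Keyword-style analysis: the whole input is one term, unchanged.
    static List<String> keywordLike(String text) {
        return List.of(text);
    }

    // Per-field overrides; any field not listed falls back to standardLike.
    private static final Map<String, Function<String, List<String>>> PER_FIELD = new HashMap<>();
    static {
        PER_FIELD.put("hungerState", PerFieldAnalysisSketch::keywordLike);
    }

    // Pick the analysis for the given field, then apply it to the text.
    static List<String> analyze(String field, String text) {
        return PER_FIELD.getOrDefault(field, PerFieldAnalysisSketch::standardLike).apply(text);
    }

    public static void main(String[] args) {
        // hungerState is treated as a keyword, so the value survives intact.
        System.out.println(analyze("hungerState", "NOT_HUNGRY"));
        // "description" is a hypothetical free-text field using the default analysis.
        System.out.println(analyze("description", "NOT_HUNGRY"));
    }
}
```

With this arrangement the hungerState query term would match a value indexed as UN_TOKENIZED, while free-text fields would still be tokenized and lowercased as before.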