Why does my Lucene.net search fail when performing a fuzzy search on multiple words in the search query?
In my application I have a Company whose name field is "This is a test", which is correctly being indexed by Lucene.Net. For reference, my MultiFieldQueryParser has its default operator set to QueryParser.Operator.AND.
My search passes when I search for "this test~" and "this tst~". However, my search fails when I attempt to search for "this~ test~", "thas~ test~", "thas test~", and other variations.
The whole purpose of this is to let users misspell their search a bit, so that a search for "Jon Doe" still shows results for "John Doe" and users do not have to remember the exact spelling of what they entered in the database. Unfortunately, it seems that fuzzy matching is only being allowed on the last term in the search phrase. Am I doing something wrong, or do I need to use a whole separate Analyzer in order to do this?
I recently had to implement something similar on my project. I ended up splitting the phrase into multiple segments and constructing the query manually:
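    using System;
    using Lucene.Net.Index;   // Term
    using Lucene.Net.Search;  // BooleanQuery, BooleanClause, FuzzyQuery

    var input = "This is a test";
    var fieldName = "yourField";
    var minimumSimilarity = 0.5f; // between 0 and 1; higher means a stricter match
    var prefixLength = 3;         // leading characters that must match exactly

    var query = new BooleanQuery();
    var segments = input.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var segment in segments)
    {
        // One fuzzy clause per word of the input.
        var term = new Term(fieldName, segment);
        var fuzzyQuery = new FuzzyQuery(term, minimumSimilarity, prefixLength);
        // SHOULD makes each word optional, so one unmatched word does not reject the document.
        query.Add(fuzzyQuery, BooleanClause.Occur.SHOULD);
    }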
Very primitive, I know, but it appears to work.
Note: this has only been tested against Lucene.Net v2.3.1.3.
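One caveat with building the query by hand, and a likely reason the parsed "this~ test~" query fails: fuzzy terms are not passed through the analyzer, so they are compared against the index exactly as typed. If the field was indexed with an analyzer that lowercases text and drops stop words (an assumption, since the question does not say which analyzer is used), then a word like "this" is never in the index, and a required fuzzy clause on it will usually find nothing to match. Here is a minimal sketch of normalizing the input first. The FuzzyTermPreparer class and its short stop-word list are hypothetical and should be replaced with whatever your analyzer actually applies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuzzyTermPreparer
{
    // Hypothetical stop-word list for illustration only; use the list
    // that the analyzer which built your index actually applies.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.Ordinal) { "a", "is", "the", "this" };

    // Lowercases the input, splits it on spaces and tabs, and drops stop words,
    // so the remaining words resemble what a lowercasing analyzer would index.
    public static List<string> Prepare(string input)
    {
        return input
            .ToLowerInvariant()
            .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !StopWords.Contains(word))
            .ToList();
    }
}
```

With the input from the answer above, FuzzyTermPreparer.Prepare("This is a test") returns a single term, "test". Each returned term can then be passed to new Term(fieldName, term) in the loop in place of the raw segment.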