Why does my Lucene.net search fail when performing a fuzzy search on multiple words in the search query?

In my application I have a Company whose name field is "This is a test", and it is being indexed correctly by Lucene.Net. For reference, my MultiFieldQueryParser has its default operator set to QueryParser.Operator.AND.

My search succeeds when I search for "this test~" and "this tst~". However, it fails when I search for "this~ test~", "thas~ test~", "thas test~", and other variations.

The whole purpose is to let the user misspell their search a bit, so that a search for Jon Doe still shows results for John Doe and users do not have to remember the exact spelling of what they entered in the database. Unfortunately, it seems that fuzzy matching is only applied to the last term in the search phrase. Am I doing something wrong, or do I need to use a completely different Analyzer to do this?


I recently had to implement something similar on my project.

I ended up splitting the phrase into individual terms and constructing the query manually, with one FuzzyQuery per term.

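using System;
using Lucene.Net.Index;
using Lucene.Net.Search;

var input = "This is a test";

var fieldName = "yourField";
var minimumSimilarity = 0.5f; // between 0 and 1; higher demands closer matches
var prefixLength = 3;         // leading characters that must match exactly
var query = new BooleanQuery();

// Split on spaces. These terms bypass the Analyzer, so normalize them
// (e.g. lowercase) yourself if the index was built that way.
var segments = input.Split(new[] {" "}, StringSplitOptions.RemoveEmptyEntries);
foreach (var segment in segments)
{
    var term = new Term(fieldName, segment);
    var fuzzyQuery = new FuzzyQuery(term, minimumSimilarity, prefixLength);
    // SHOULD makes each term optional; use MUST to require every term.
    query.Add(fuzzyQuery, BooleanClause.Occur.SHOULD);
}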

Very primitive, I know, but it appears to work.

Note: this has only been tested against Lucene.net v2.3.1.3
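
To get a feel for how minimumSimilarity and prefixLength interact, here is a rough, Lucene-free sketch. It assumes the score is 1 - editDistance / (length of the shorter term) and that the first prefixLength characters must match exactly, which is my understanding of how older Lucene versions behave; it is not the library's actual code. FuzzySketch, EditDistance and WouldMatch are hypothetical names made up for this illustration.

```csharp
using System;

static class FuzzySketch
{
    // Classic Levenshtein edit distance (insert, delete, substitute).
    public static int EditDistance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (var i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (var j = 0; j <= b.Length; j++) d[0, j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // ASSUMPTION: approximates the fuzzy test, not Lucene's real implementation.
    // The terms must share their first prefixLength characters, and the
    // normalized similarity must be strictly above minimumSimilarity.
    public static bool WouldMatch(string queryTerm, string indexedTerm,
                                  float minimumSimilarity, int prefixLength)
    {
        var p = Math.Min(prefixLength, queryTerm.Length);
        if (indexedTerm.Length < p ||
            string.CompareOrdinal(queryTerm, 0, indexedTerm, 0, p) != 0)
        {
            return false; // required prefix differs
        }

        var shorter = Math.Min(queryTerm.Length, indexedTerm.Length);
        if (shorter == 0) return false;

        var similarity = 1f - (float)EditDistance(queryTerm, indexedTerm) / shorter;
        return similarity > minimumSimilarity;
    }
}
```

Under these assumptions, WouldMatch("thas", "this", 0.5f, 3) is false because the first three characters differ, while the same call with a prefixLength of 0 is true. The same applies to "jon" against "john", and to "This" against a lowercased "this", so a smaller prefixLength and lowercased input may be worth trying if misspellings near the start of a word need to match.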
