Luke Lucene BooleanQuery

In Luke, the following search expression returns 23 results:

docurl:www.siteurl.com  docfile:Tomatoes*

If I pass this same expression into my C# Lucene.NET app with the following implementation:

        IndexReader reader = IndexReader.Open(indexName);
        Searcher searcher = new IndexSearcher(reader);
        try
        {
            QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
            BooleanQuery bquery = new BooleanQuery();
            Query parsedQuery = parser.Parse(query);
            bquery.Add(parsedQuery, Lucene.Net.Search.BooleanClause.Occur.MUST);
            int _max = searcher.MaxDoc();
            BooleanQuery.SetMaxClauseCount(Int32.MaxValue);
            TopDocs hits = searcher.Search(parsedQuery, _max);
            ...
        }

I get 0 results.

Luke is using StandardAnalyzer and this is what the Explain Structure window looks like:

[Screenshot: Luke's Explain Structure window for the query]

Must I manually create BooleanClause objects for each field I search on, specifying Should for each one then add them to the BooleanQuery object with .Add()? I thought the QueryParser would do this for me. What am I missing?

Edit: Simplifying a tad, docfile:Tomatoes* returns 23 docs in Luke, yet 0 in my app. Per Gene's suggestion, I've changed from MUST to SHOULD:

            QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
            BooleanQuery bquery = new BooleanQuery();
            Query parsedQuery = parser.Parse(query);
            bquery.Add(parsedQuery, Lucene.Net.Search.BooleanClause.Occur.SHOULD);
            int _max = searcher.MaxDoc();
            BooleanQuery.SetMaxClauseCount(Int32.MaxValue);
            TopDocs hits = searcher.Search(parsedQuery, _max);

parsedQuery is simply docfile:tomatoes*

Edit2:

I think I've finally gotten to the root problem:

            QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
            Query parsedQuery = parser.Parse(query);

In the second line, query is "docfile:Tomatoes*", but parsedQuery is {docfile:tomatoes*}. Notice the difference? Lower case 't' in the parsed query. I never noticed this before. If I change the value in the IDE to 'T', 23 results return.

I've verified that StandardAnalyzer is being used both when indexing and when reading the index. How do I force QueryParser to keep the case of the value of query?

Edit3: Wow, how frustrating. According to the documentation, I can accomplish this with:

parser.setLowercaseExpandedTerms(false);

Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically lower-cased or not. Default is true.

I won't argue whether that's a sensible default or not. I suppose SimpleAnalyzer should have been used to lowercase everything in and out of the index. The frustrating part is, at least with the version I'm using, Luke defaults the other way! At least I learned a bit more about Lucene.
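For reference, the setting from Edit3 might look like this in Lucene.NET. This is only a sketch: it assumes a Lucene.NET 2.x release, where the Java setter is exposed as the method `SetLowercaseExpandedTerms` (later releases expose a property instead), and `ParseKeepingCase` is a hypothetical helper name, not part of the library.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

static class CaseSensitivePrefixExample
{
    // Hypothetical helper: parses queryText without lower-casing the terms
    // of wildcard, prefix, fuzzy and range queries.
    public static Query ParseKeepingCase(string queryText)
    {
        // Same default field and analyzer as in the question.
        QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());

        // Assumption: Lucene.NET 2.x method name. The default is true.
        parser.SetLowercaseExpandedTerms(false);

        return parser.Parse(queryText);
    }
}
```

With this setting, "docfile:Tomatoes*" should keep its capital 'T' after parsing, which is the form that returned 23 results when edited by hand in the IDE.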


Using Occur.MUST is equivalent to using the + operator with the standard query parser. Thus your code is evaluating +docurl:www.siteurl.com +docfile:Tomatoes* rather than the expression you typed into Luke. To get the same behavior as Luke, try Occur.SHOULD when adding your clauses.
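As a rough illustration of the SHOULD variant, the Luke expression could be built by hand as below. This is a sketch under assumptions: it targets the Lucene.NET 2.x API used in the question, `BuildQuery` is a hypothetical helper name, and the term texts are assumed to match the index exactly (for example, that docurl was indexed as the single token www.siteurl.com).

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

static class ShouldClausesExample
{
    // Hypothetical helper: two optional clauses, like typing
    // "docurl:www.siteurl.com docfile:Tomatoes*" with no + operators.
    public static BooleanQuery BuildQuery()
    {
        BooleanQuery bquery = new BooleanQuery();

        // Assumption: the URL is stored as one unanalyzed term.
        bquery.Add(new TermQuery(new Term("docurl", "www.siteurl.com")),
                   BooleanClause.Occur.SHOULD);

        // A prefix query is what QueryParser produces for "Tomatoes*".
        // The prefix is used as given here, with no lower-casing.
        bquery.Add(new PrefixQuery(new Term("docfile", "Tomatoes")),
                   BooleanClause.Occur.SHOULD);

        return bquery;
    }
}
```

The resulting query would be passed to searcher.Search in place of parsedQuery. Note that the code in the question builds bquery but then searches with parsedQuery, so the Occur value never affects its results.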


QueryParser will indeed take a query like "docurl:www.siteurl.com docfile:Tomatoes*" and build a proper query out of it (boolean query, range query, etc.) depending on the query given (see query syntax).

Your first step should be to attach a debugger and inspect the value and type of parsedQuery.
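If a debugger is not handy, a small dump of the parsed query gives the same information. The sketch below assumes the Lucene.NET 2.x API from the question; `Dump` is a hypothetical helper name.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

static class QueryInspection
{
    // Hypothetical helper: prints the concrete type and the text form
    // of whatever QueryParser produced for queryText.
    public static void Dump(string queryText)
    {
        QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
        Query parsedQuery = parser.Parse(queryText);

        // Concrete type, e.g. a BooleanQuery or a PrefixQuery.
        Console.WriteLine(parsedQuery.GetType().FullName);

        // Text form of the query after parsing, including any case changes.
        Console.WriteLine(parsedQuery.ToString());
    }
}
```

Comparing the printed text with the string typed into Luke makes differences such as a lower-cased prefix easy to spot.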
