Lucene: getting every document in the collection as results
When I perform a query in Lucene (topDocs = searcher.search(booleanQuery, 220000);) I get 170 hits as retrieved docs. That is correct, but I would like to have the full list of docs in the results, even if their scores are very low.
Is there a way to force Lucene to return the full list of documents in my collection, and not just the relevant ones?
Or does this mean that every other doc has a score of 0?
Thanks
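Since Lucene 3.x, you can use TotalHitCountCollector to retrieve the total number of hits for a query. You can then retrieve all matching documents by passing that count as the number of results. Be careful with the case where there are no hits: the search call rejects a hit count of 0, which is why the count is wrapped in Math.max(1, ...).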
TotalHitCountCollector collector = new TotalHitCountCollector();
searcher.search(booleanQuery, collector);
topDocs = searcher.search(booleanQuery, Math.max(1, collector.getTotalHits()));
Please specify q=*:* as the search term.
This question is old now, but I think what the OP was looking for is the MatchAllDocsQuery class.
You can add some field to all docs, like test:1, and then search like [your_query] OR test:1.
It should work if you search for '*' and allow leading * in wildcard queries. Just did a test in Luke on a 501 document index, which returns 501 results for this query.
Lucene does not do any filtering based on score. If a query has 170 hits, then it means that only 170 documents matched the query. The rest of the documents did not match and can be presumed to have a score of 0.
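If you still want one list that covers the whole collection, you can pad the hits yourself with zero-score entries. The following is only a minimal sketch in plain Java, not Lucene API: FullRanking, Entry and padWithZeroScores are hypothetical names, the hits map stands in for the (doc ID, score) pairs you would read from topDocs.scoreDocs, and maxDoc stands in for the number of documents in the index. It also assumes there are no deleted documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: pads a hit list so that it
// covers every doc ID in the collection.
final class FullRanking {

    // Stand-in for a (doc ID, score) pair such as Lucene's ScoreDoc.
    static final class Entry {
        final int docId;
        final float score;

        Entry(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // hits: doc ID -> score for the documents that matched the query.
    // maxDoc: assumed total number of documents (no deletions).
    static List<Entry> padWithZeroScores(Map<Integer, Float> hits, int maxDoc) {
        List<Entry> ranked = new ArrayList<>(maxDoc);

        // Matching documents first, highest score first.
        hits.entrySet().stream()
                .sorted((a, b) -> Float.compare(b.getValue(), a.getValue()))
                .forEach(e -> ranked.add(new Entry(e.getKey(), e.getValue())));

        // Every document that did not match is appended with a score of 0.
        for (int docId = 0; docId < maxDoc; docId++) {
            if (!hits.containsKey(docId)) {
                ranked.add(new Entry(docId, 0f));
            }
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Two hits in a five-document collection.
        Map<Integer, Float> hits = Map.of(3, 1.5f, 1, 0.2f);
        for (Entry e : padWithZeroScores(hits, 5)) {
            System.out.println(e.docId + " " + e.score);
        }
    }
}
```

The order among the zero-score documents is arbitrary here (ascending doc ID), since nothing ranks them against each other.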
I have the same question and I couldn't find a satisfactory answer anywhere. I had read that you could just use IndexSearcher.search(query, Integer.MAX_VALUE); however, this seemed to be very slow, so I presume memory is being allocated for the result set somewhere. I really don't know why Lucene doesn't already provide a way to get the entire result set, but here's my solution...
TotalHitCollector collector = new TotalHitCollector();
indexSearcher.search(query, collector);
if (collector.getTotalHits() != 0) {
    for (int i = 0; i < collector.getTotalHits(); i++) {
        Document doc = indexSearcher.doc(collector.getDoc(i));
    }
}
... and the TotalHitCollector class...
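// Declared static because it is nested inside another class.
public static class TotalHitCollector extends SimpleCollector {
    // Doc ID offset of the index segment currently being collected.
    private int base;
    // Index-wide IDs of every document that matched the query.
    private final List<Integer> docs = new ArrayList<>();

    public int getTotalHits() {
        return docs.size();
    }

    public int getDoc(int i) {
        return docs.get(i);
    }

    @Override
    public void collect(int doc) {
        // doc is relative to the current segment; rebase it to an index-wide ID.
        doc += this.base;
        docs.add(doc);
    }

    @Override
    protected void doSetNextReader(LeafReaderContext context) {
        // Called once per segment, before any of its documents are collected.
        this.base = context.docBase;
    }

    @Override
    public ScoreMode scoreMode() {
        // Scores are not needed; only the matching doc IDs are recorded.
        return ScoreMode.COMPLETE_NO_SCORES;
    }
}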
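The doc += this.base step is needed because collect() receives doc IDs relative to the current index segment, while indexSearcher.doc() expects index-wide IDs. Here is a minimal plain-Java sketch of that arithmetic, with no Lucene dependency: DocBaseDemo, docBases and toGlobal are hypothetical names, and the segment sizes are made up for illustration.

```java
// Hypothetical illustration of how per-segment doc IDs map to index-wide IDs.
final class DocBaseDemo {

    // The docBase of a segment is the number of documents in all earlier segments.
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int next = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = next;
            next += segmentSizes[i];
        }
        return bases;
    }

    // Same arithmetic as "doc += this.base" in the collector above.
    static int toGlobal(int docBase, int localDoc) {
        return docBase + localDoc;
    }

    public static void main(String[] args) {
        // Three made-up segments holding 100, 50 and 351 documents.
        int[] bases = docBases(new int[] {100, 50, 351});
        // Local doc 7 of the third segment.
        System.out.println(toGlobal(bases[2], 7)); // prints 157
    }
}
```

Without the rebasing, every segment would report IDs starting from 0, and documents from different segments would collide.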