Lucene SpanNearQuery

2023-01-17 06:16

I am trying to understand Lucene SpanNearQuery and wrote up a dummy example. I am looking for "not" followed by "fox" within 5 positions of each other. I would expect document 3 to be returned as the only hit. However, I end up getting no hits. Any thoughts on what I might be doing wrong would be appreciated.

Here is the code:

//indexing

public void doSpanIndexing()  throws IOException {   

IndexWriter writer=new IndexWriter(directory, AnalyzerUtil.getPorterStemmerAnalyzer(new StandardAnalyzer(Version.LUCENE_30)),IndexWriter.MaxFieldLength.LIMITED);

 Document doc1=new Document();
 doc1.add(new Field("content", " brown fox jumped ", Field.Store.YES, Index.ANALYZED,  Field.TermVector.WITH_POSITIONS_OFFSETS));
 writer.addDocument(doc1);


 Document doc2=new Document();
 doc2.add(new Field("content", "foxes not jumped over the huge fence", Field.Store.YES, Index.ANALYZED,Field.TermVector.WITH_POSITIONS_OFFSETS));
 writer.addDocument(doc2);

 Document doc3=new Document();
 doc3.add(new Field("content", " brown not fox", Field.Store.YES, Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
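 writer.addDocument(doc3);

 // Close the writer so the added documents are committed and visible to the searcher.
 writer.close();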


}

//searching

public void doSpanSearching(String text) throws CorruptIndexException, IOException, ParseException {

 IndexSearcher searcher=new IndexSearcher(directory);

 SpanTermQuery term1 = new SpanTermQuery(new Term("content", "not"));
 SpanTermQuery term2 = new SpanTermQuery(new Term("content", text));
 SpanNearQuery query = new SpanNearQuery(new SpanQuery[] {term1, term2}, 5, true);
 TopDocs topDocs=searcher.search(query,5);

 // Loop over the hits actually returned; totalHits can be larger than scoreDocs.length.
 for(int i=0; i<topDocs.scoreDocs.length; i++) {
   System.out.println("Hit Document number: "+topDocs.scoreDocs[i].doc);
   System.out.println("Hit Document score: "+topDocs.scoreDocs[i].score);
   Document result=searcher.doc(topDocs.scoreDocs[i].doc);
   System.out.println("Search result "+(i+1)+ " is "+result.get("content"));

  }

}

"not" is a stop word in the StandardAnalyzer, i.e. it is removed from your text at index time, so the SpanTermQuery for "not" has nothing to match. Can you try it with another word that is not a stop word?
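
To make the effect concrete, here is a minimal plain-Java sketch of what happens to document 3. It does not use Lucene at all: `StopWordSpanDemo`, `analyze`, `orderedNear` and the small `STOP_WORDS` set are hypothetical names made up for illustration, and stemming is left out. It only models two ideas, dropping stop words while keeping token positions and an ordered "near" match with a slop, so treat it as a rough picture of the behavior rather than how Lucene implements it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopWordSpanDemo {

    // Illustrative subset of English stop words (assumption, not Lucene's actual list).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "no", "not", "of"));

    // Lowercase, split on whitespace, and drop stop words.
    // A dropped word leaves a null placeholder so the remaining positions are unchanged.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().toLowerCase().split("\\s+")) {
            tokens.add(stopWords.contains(raw) ? null : raw);
        }
        return tokens;
    }

    // True if `first` occurs before `second` with at most `slop` positions between them.
    static boolean orderedNear(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!first.equals(tokens.get(i))) {
                continue;
            }
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (second.equals(tokens.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc3 = " brown not fox";

        // With stop words removed, "not" never reaches the index, so nothing matches.
        List<String> withStops = analyze(doc3, STOP_WORDS);
        System.out.println(withStops + " -> " + orderedNear(withStops, "not", "fox", 5));

        // With an empty stop set, "not" is kept and the ordered near match succeeds.
        List<String> noStops = analyze(doc3, new HashSet<String>());
        System.out.println(noStops + " -> " + orderedNear(noStops, "not", "fox", 5));
    }
}
```

Running `main` prints `[brown, null, fox] -> false` followed by `[brown, not, fox] -> true`. In the real code the equivalent change would be to index with an analyzer that does not remove "not", or to test with a term that is not a stop word, as suggested above.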
