Java Lucene Obtain Terms from Document object
I understand how to obtain the document set from a Term object, but can you go the other way around and obtain the terms/term frequencies from a Document object?
Yes, it is possible to get the terms from a document, but there is no easy API for it. IndexReader has a method, getTermFreqVector, that retrieves the terms in a document. You need to build a custom TermVectorMapper and pass it to getTermFreqVector().
In the custom TermVectorMapper, terms and their frequencies are collected in the map() method. Once getTermFreqVector() returns, the terms can be retrieved from the custom mapper.
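As a rough illustration of the collecting step, here is a minimal sketch of such a mapper. So that it compiles without Lucene on the classpath, it uses a hypothetical stand-in class, MapperStandIn, in place of Lucene's TermVectorMapper. The names MapperStandIn, TermFreqCollector and getFrequencies are assumptions made for this example, and the callback signatures are simplified: in Lucene 3.x, as far as I recall, the real map() callback also receives offset and position arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the callback shape of Lucene's TermVectorMapper,
// so this sketch compiles without Lucene. Signatures are simplified.
abstract class MapperStandIn {
    abstract void setExpectations(String field, int numTerms);
    abstract void map(String term, int frequency);
}

// Collects term -> frequency pairs as the callbacks arrive.
class TermFreqCollector extends MapperStandIn {
    private final Map<String, Integer> frequencies = new LinkedHashMap<>();

    @Override
    void setExpectations(String field, int numTerms) {
        // Called before a field's terms are mapped; nothing to prepare here.
    }

    @Override
    void map(String term, int frequency) {
        // Sum the counts in case the same term shows up in more than one field.
        frequencies.merge(term, frequency, Integer::sum);
    }

    Map<String, Integer> getFrequencies() {
        return frequencies;
    }
}

class TermFreqCollectorDemo {
    public static void main(String[] args) {
        TermFreqCollector collector = new TermFreqCollector();
        // Simulates the callbacks the index reader would make for one document.
        collector.setExpectations("body", 2);
        collector.map("lucene", 2);
        collector.map("index", 1);
        System.out.println(collector.getFrequencies());
    }
}
```

With the real library, the collector would extend TermVectorMapper instead of the stand-in and be passed to getTermFreqVector() together with the document number. Note that this only returns anything for fields that were indexed with term vectors enabled.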