开发者

Can we tell Solr/Lucene max chars to analyze for a search?

I have a problem that in my lucene index files one document can have huge text. now when i search one of these huge text documents lucene/solr does not filter any results even the search term exist in the document text. the reason that i think might be the large number of characters in document text? if yes than how could we tell solr/lucene how much characters to analyze during search, please explain

I am using Solr 1.4.1 c开发者_如何学Goan any

Thanks Ahsan


Lucene can handle huge documents without trouble. It seems unlikely that the document size itself is the problem. Use a tool like Luke to inspect the index and see what terms are associated with some of these large documents.


Also, have you changed the maxFieldLength setting in solrconfig.xml? I am testing out indexing the Bible, at 25 MB of data, and with a maxFieldLength of 10,000, which is the default, only the first 10,000 tokens ever get analysized, which leads to roughly ~2000 unique terms for my document.

If you are using Lucene directly, then there are a couple setting for maxFieldLength, you may have "unlimited" and therefore getting everything. Check the JavaDocs for how to set maxFieldLength.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜