distinct SOLR field values without count

2023-01-07 11:14

My question is pretty similar to this question

The difference: I need the least RAM-intensive way to gather the distinct values. I DON'T care about the actual counts in this case; I just want to know the possible values for that field.

I'm constantly running out of heap space (30 million+ documents), and there has to be some way or parameter to do this in a memory-saving way.

If the number of distinct values is high, you will probably need to do facet paging. Use the facet.offset and facet.limit parameters.
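As a rough illustration of facet paging, the sketch below walks a facet field one page at a time using facet.offset and facet.limit. The core URL, the field name, and the helper names (fetch_facet_page, iter_distinct_values) are made up for this example, and rows=0, facet.mincount=1 and facet.sort=index are choices of the sketch rather than something the answer prescribes. Whether this actually lowers heap usage on a 30 million document index would need to be measured.

```python
import json
import urllib.parse
import urllib.request


def fetch_facet_page(base_url, field, offset, limit):
    """Fetch one page of facet values for `field` from a Solr select handler."""
    params = urllib.parse.urlencode({
        "q": "*:*",
        "rows": 0,              # no documents, only facet data
        "wt": "json",
        "json.nl": "flat",      # facet entries arrive as [value, count, value, count, ...]
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 1,    # skip values that no document uses
        "facet.sort": "index",  # stable ordering so pages do not overlap
        "facet.offset": offset,
        "facet.limit": limit,
    })
    with urllib.request.urlopen(f"{base_url}/select?{params}") as resp:
        body = json.load(resp)
    flat = body["facet_counts"]["facet_fields"][field]
    return flat[0::2]  # keep the values, drop the counts


def iter_distinct_values(fetch_page, page_size=1000):
    """Yield every value by calling fetch_page(offset, limit) until a short page comes back."""
    offset = 0
    while True:
        values = fetch_page(offset, page_size)
        yield from values
        if len(values) < page_size:
            break
        offset += page_size
```

Against a live server this could be wired up with something like `functools.partial(fetch_facet_page, "http://localhost:8983/solr/mycore", "category")` passed as `fetch_page`, where both the URL and the field name are placeholders.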

Use the StatsComponent to retrieve a list of distinct values for a certain field: https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

Parameter stats.calcdistinct:

If true, distinct values will be calculated and returned as "countDistinct" and "distinctValues" in the response. This calculation may be expensive for some fields, so it is false by default. If you'd only like to return distinct values for specific fields, you can also specify f.<fieldname>.stats.calcdistinct, replacing <fieldname> with your field name, to limit the distinct value calculation to the required field.
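A minimal sketch of how such a request could be put together and read back, assuming a JSON response in which the per-field stats sit under stats, then stats_fields. The helper names and the field name "category" are invented for illustration, and rows=0 is added by the sketch to avoid fetching documents.

```python
import urllib.parse


def build_distinct_stats_query(field):
    """Build the query string for a stats request that lists the distinct values of `field`."""
    return urllib.parse.urlencode({
        "q": "*:*",
        "rows": 0,        # only the stats section is of interest
        "wt": "json",
        "stats": "true",
        "stats.field": field,
        f"f.{field}.stats.calcdistinct": "true",  # per-field form of stats.calcdistinct
    })


def extract_distinct_values(response, field):
    """Pull the distinct values for `field` out of an already parsed JSON response."""
    field_stats = response["stats"]["stats_fields"][field]
    return field_stats.get("distinctValues", [])
```

The query string would be appended to the core's select URL, for example `.../select?` followed by `build_distinct_stats_query("category")`.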

To keep the load down, retrieve it as few times as possible and cache the results and only retrieve again when the data has changed.
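One way to picture that caching advice is a small wrapper that reloads the values only when some version marker changes. Everything here is hypothetical: load_values stands for whatever query fetches the distinct values, and get_version stands for any cheap signal that the data changed (for instance a counter the indexing job bumps); neither is a Solr API.

```python
class DistinctValuesCache:
    """Cache a list of distinct values and reload it only when the data version changes."""

    def __init__(self, load_values, get_version):
        self._load_values = load_values  # hypothetical: runs the expensive Solr query
        self._get_version = get_version  # hypothetical: cheap "has the data changed?" marker
        self._version = None
        self._values = None

    def get(self):
        version = self._get_version()
        if self._values is None or version != self._version:
            # First call, or the data changed since the last load: query again.
            self._values = self._load_values()
            self._version = version
        return self._values
```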

If your index is slow in general you might want to have a look at the cache configuration and/or give SOLR more RAM (if you have the means).

Originally answered here (by me):

https://stackoverflow.com/a/26714447/621690

I don't know about RAM usage, but you might want to try field collapsing. You will find the patch for Solr here.
