Faceted Search without duplication of data (no ETL)

2023-03-11 17:41 Q&A author:

All the solutions I've seen so far involve duplicating data, using either NoSQL or data warehousing. Are there more efficient ways?

2011-06-07 EDIT: When I say no duplication, I mean no ETL either. I would like to extract data directly from the main database. It's relational, but there is still time for me to change that.

There is a patch for Solr that adds field collapsing. It works fairly well, although problems have been reported when the returned result set is millions of documents long.

Also, it doesn't calculate facet counts very precisely: sometimes the sum of all the facet counts doesn't tally with the number of documents in the set. However, the difference always seems to be small. I noticed discrepancies of less than 100 for result sets of 10,000 to 50,000 documents.
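If you want to measure that discrepancy on your own data, something like the following rough sketch may help. It assumes the response was requested with `wt=json` in Solr's default flat facet layout (`[value, count, value, count, ...]`), that the facet field is single-valued and present on every document, and that `facet.limit=-1` was set so no values are cut off. The `category` field and the sample numbers are made up for illustration.

```python
def facet_total(flat_counts):
    """Sum the counts in a flat [value, count, value, count, ...] list."""
    return sum(flat_counts[1::2])


def facet_discrepancy(response, field):
    """Return numFound minus the sum of facet counts for one field."""
    num_found = response["response"]["numFound"]
    counts = response["facet_counts"]["facet_fields"][field]
    return num_found - facet_total(counts)


# Made-up response fragment, only to show the expected shape.
sample = {
    "response": {"numFound": 10000},
    "facet_counts": {
        "facet_fields": {"category": ["books", 6000, "music", 3950]}
    },
}

print(facet_discrepancy(sample, "category"))
```

A non-zero result is only meaningful under the assumptions above; with multi-valued fields or documents missing the field, the sum would not be expected to match anyway.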

Obviously, to use this patch you'll have to build your own version of Solr. If you're not comfortable with that, you can try the prebuilt version I am using. I have uploaded both a patched .war file and my "lib" folder to my SkyDrive (I'm not sure whether the latter is necessary or whether the patch changes any libraries, but they are there just in case). I should also mention that this version is to be used at your own risk: it has served me without any serious complaints, but I can't guarantee the same for others. Here's the download link.

Alternatively, you can wait for Solr 4 to be released. It will include field collapsing, but it still had unresolved critical issues the last time I checked. By the way, its collapsing search parameters won't be compatible with those of the patch described above, so if you use one first and then switch to the other, you'll need to amend your code as well.
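To keep that later code change small, it may be worth isolating the collapsing parameters in one place. Here is a minimal sketch, assuming the patch takes the `collapse.field` parameter and the built-in feature uses the result grouping parameters `group=true` and `group.field`. Check both against the version you actually run. The URL, the `product_id` and `category` fields, and the function names are hypothetical.

```python
from urllib.parse import urlencode


def collapse_params(field, use_grouping):
    """Return the collapsing parameters for one of the two styles.

    use_grouping=False: patch-style parameter (assumed: collapse.field).
    use_grouping=True:  built-in result grouping (group, group.field).
    """
    if use_grouping:
        return {"group": "true", "group.field": field}
    return {"collapse.field": field}


def build_query_url(base_url, query, facet_field, collapse_field, use_grouping):
    """Build a faceted select URL with collapsing on one field."""
    params = {"q": query, "facet": "true", "facet.field": facet_field}
    params.update(collapse_params(collapse_field, use_grouping))
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field names.
BASE = "http://localhost:8983/solr/select"
patched_url = build_query_url(BASE, "laptop", "category", "product_id", False)
grouped_url = build_query_url(BASE, "laptop", "category", "product_id", True)
print(patched_url)
print(grouped_url)
```

With this layout, switching from the patched build to the released feature would mean flipping one flag rather than touching every query. Note that the response format also differs between the two, so the parsing code would need its own adjustment.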

Continue reading: data-warehouse faceted-search key-value-store nosql solr
