What are the challenges in embedding text search (Lucene/Solr/Hibernate Search) in applications that are hosted at client sites?

Asked 2023-02-18 01:12

We have an enterprise Java web app that our (external) customers deploy on their intranets. I am exploring different full-text search options (Lucene, Solr, Hibernate Search), and one common concern is the deployment, administration, and tuning overhead they bring.

This is particularly challenging in our case, since we do not host these applications ourselves. From what I have seen, most uses of these technologies have been in hosted applications. Our customers typically deploy our application in a clustered environment and have no experience with Lucene or Solr.

Does anyone have any experience with this? What challenges have you encountered with this approach? How did you overcome them? At this point I am trying to determine if this is feasible.

Thank you

It is very feasible to deploy applications that use Lucene (or Solr) onto client sites.

Some things to keep in mind:

Administration

* You need a way to version your index, so that if/when you change the document structure in the index, it can be upgraded.
* You therefore need a good way to force a re-index of all existing data. It is probably also a good idea to provide an Admin option that lets an administrator trigger a re-index.
* You could also provide an Admin option to call optimize() on your index, or have this scheduled. Test the actual impact first, since it may not be needed depending on the shape of your index.
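One way to handle the versioning and forced re-index points above is to keep a small version marker next to the index and compare it against a constant in the code at startup. The following is only a sketch under that assumption: `IndexVersionGuard`, the `schema.version` file name, and the `rebuildJob` callback are all made-up names, and the actual rebuild (deleting the old index and re-adding every document through Lucene) is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: tracks which schema version an on-disk index was built with. */
final class IndexVersionGuard {
    /** Bump this whenever the document structure in the index changes (assumed convention). */
    static final int CURRENT_SCHEMA_VERSION = 2;

    /** Marker file stored alongside the index files; the name is an assumption. */
    private static final String MARKER_FILE = "schema.version";

    private final Path indexDir;

    IndexVersionGuard(Path indexDir) {
        this.indexDir = indexDir;
    }

    /** Returns the version recorded next to the index, or -1 if it is missing or unreadable. */
    int storedVersion() {
        Path marker = indexDir.resolve(MARKER_FILE);
        if (!Files.exists(marker)) {
            return -1;
        }
        try {
            String text = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
            return Integer.parseInt(text);
        } catch (IOException | NumberFormatException e) {
            return -1; // treat a broken marker the same as "index needs a rebuild"
        }
    }

    boolean needsReindex() {
        return storedVersion() != CURRENT_SCHEMA_VERSION;
    }

    /**
     * Runs rebuildJob when the stored version is stale, or always when force is true
     * (the Admin-triggered case), then records the current version.
     * Returns true if a rebuild was run.
     */
    boolean reindexIfNeeded(boolean force, Runnable rebuildJob) {
        if (!force && !needsReindex()) {
            return false;
        }
        rebuildJob.run();
        try {
            Files.createDirectories(indexDir);
            Files.write(indexDir.resolve(MARKER_FILE),
                    Integer.toString(CURRENT_SCHEMA_VERSION).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

At application startup you would call `reindexIfNeeded(false, job)`, and the Admin screen would call `reindexIfNeeded(true, job)`. The marker is written only after the job returns, so a rebuild that fails with an exception is retried on the next start.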

Deployment

If you are deploying into a clustered environment, the simplest solution (and the fastest, in terms of both dev speed and runtime speed) could be to create the index on each node.

Tuning

* Do you have a reasonable approximation of the dataset you will be indexing? You need to understand how your index scales (in both speed and size), since what you consider a reasonable dataset size may not match what your clients consider reasonable. At a minimum, you need to be able to tell clients which factors lead to an overly large index and possibly slower performance.

There are two advantages to embedding Lucene in your app over sending the queries to a separate Solr cluster: performance and ease of deployment/installation. Embedding Lucene means running it in the same JVM, so there are no additional server round trips. Commits should be batched in a separate thread. Embedding Lucene also just means adding a few more JAR files to your classpath, so there is no separate Solr install.
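To illustrate the "batch commits in a separate thread" point, here is a rough sketch using only `java.util.concurrent`. `BatchedCommitter` and `CommitAction` are hypothetical names; `CommitAction` stands in for whatever actually commits your index (for example a call to Lucene's `IndexWriter.commit()`), and the interval is something you would have to tune yourself.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: request threads mark changes, one background thread commits them in batches. */
final class BatchedCommitter implements AutoCloseable {
    /** Stand-in for the real commit call; this interface is an assumption, not a Lucene type. */
    interface CommitAction {
        void commit() throws Exception;
    }

    private final CommitAction action;
    private final AtomicInteger pending = new AtomicInteger();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "index-commit");
                t.setDaemon(true); // do not keep the JVM alive just for commits
                return t;
            });

    BatchedCommitter(CommitAction action, long intervalMillis) {
        this.action = action;
        scheduler.scheduleWithFixedDelay(this::flush, intervalMillis, intervalMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Called by request threads after adding or updating a document; never waits for a commit. */
    void markDirty() {
        pending.incrementAndGet();
    }

    /** Commits once if anything changed since the last commit; returns how many changes it covered. */
    synchronized int flush() {
        int n = pending.getAndSet(0);
        if (n == 0) {
            return 0;
        }
        try {
            action.commit();
        } catch (Exception e) {
            pending.addAndGet(n); // keep the changes marked so the next run retries
            return 0;             // a real application would also log the failure here
        }
        return n;
    }

    /** Stops the background thread and commits whatever is still pending. */
    @Override
    public void close() {
        scheduler.shutdown();
        flush();
    }
}
```

Usage would look something like `new BatchedCommitter(writer::commit, 5000)`, with `markDirty()` called after each document update. The trade-off is that a change becomes durable (and visible to newly opened readers) up to one interval late.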

If your app is cluster-aware, then the embedded Lucene option becomes highly problematic. An update made on one node in the cluster needs to be searchable from every node. Synchronizing the Lucene index across all nodes yields no better performance than using Solr. With Solr 4, you may find the administration to be less of a barrier to entry for your customers. Check out the literature on the grossly misnamed SolrCloud.
