Apache Solr Failover Support in Master-Slave Setup

2023-03-13 17:23

Our development team is currently looking into migrating our search system to Apache Solr, and we would greatly appreciate some advice on setup. We are indexing approximately two hundred million database rows. We add about a hundred thousand new rows throughout the day. These new database rows must be searchable within two minutes of their receipt.

We don't want the indexing to bog down the searcher, so our thought is to have two Solr servers running on different machines in a replication setup. The first Solr instance will be the indexer. It will use the DataImportHandler to index the delta and have autocommit enabled to prevent overzealous commit rates. Index optimization will take place during scheduled periods. The second Solr instance (the slave) will be the primary searcher and will have its indexes stored on RAIDed solid state drives.

What we are concerned about is failover. Our searches are mission-critical. If the primary searcher goes down for whatever reason, our search service will automatically shunt queries over to the indexer node instead. Indexing is equally critical, though. If the indexer dies, we need to have a warm failover standing by. Is there a recommended way to automate master node failover in Solr replication? I've begun looking into ZooKeeper, but I wasn't sure if this was the best approach.

As you've identified, search failover can be handled using replication.

Master failover is a little bit more tricky. One idea is to use something like the following logical setup:

+--------+       +--------+
|  Slave |  ...  |  Slave |
+--------+       +--------+
     |               |
     v (replicate)   v
+---------------------------+
|     Load balancer         |
+---------------------------+
         /         \
        v           v
+--------+       +--------+
| Master | --->  | Master |
+--------+       +--------+

To keep the master indices up to date, repeater mode can be used, so that a hot-backup master replicates from the primary master. For the failover itself, either:
- Use something like the ping handler on the primary master as a keep-alive check, and write a small component that triggers the DataImportHandler on the secondary master to take over when the primary cannot be reached.
- Keep the DataImportHandlers active on all master servers, allowing any of them to take over operation without additional configuration.
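The first option could look roughly like the sketch below. It is only an illustration under assumptions: the host names, the core name `core1`, the three-strikes threshold and the `Watchdog` class are all made up for the example, and it assumes the ping handler lives at `/admin/ping` and the DataImportHandler at `/dataimport`, which depends on how your solrconfig.xml registers them.

```python
import http.client
import time
import urllib.request

# Hypothetical hosts and core name; adjust to the real deployment.
PRIMARY_PING_URL = "http://master1:8983/solr/core1/admin/ping?wt=json"
SECONDARY_IMPORT_URL = "http://master2:8983/solr/core1/dataimport?command=delta-import"


def http_get(url, timeout=5.0):
    """Return the HTTP status of a GET, or 0 on network or HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except (OSError, http.client.HTTPException):
        return 0


class Watchdog:
    """Pings the primary master and fails over after repeated misses."""

    def __init__(self, get=http_get, max_failures=3):
        self.get = get                  # injectable so it can be tested offline
        self.max_failures = max_failures
        self.failures = 0               # consecutive failed pings
        self.failed_over = False

    def check_once(self):
        """Ping once. Return True only on the call that triggers failover."""
        if self.get(PRIMARY_PING_URL) == 200:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures and not self.failed_over:
            # Assumption: a delta-import on the secondary is the takeover step.
            self.get(SECONDARY_IMPORT_URL)
            self.failed_over = True
            return True
        return False

    def run(self, poll_seconds=10.0):
        """Poll until failover has been triggered, then stop."""
        while not self.failed_over:
            self.check_once()
            time.sleep(poll_seconds)
```

Calling `Watchdog().run()` from a small supervised process would be one way to use it. Requiring several consecutive misses is a deliberate choice to avoid failing over on a single dropped ping. Whatever schedules your regular delta imports would still need to be pointed at the secondary master afterwards.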

Note that you might need to configure the load balancer such that a slave can only replicate from one master at any point in time.
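One way to read that constraint is as an active/passive policy: the masters are listed in a fixed priority order and all replication traffic goes to the first one that is healthy. The sketch below shows just that selection logic. The function name and the `is_healthy` callback are hypothetical, and a real load balancer would express the same idea in its own configuration rather than in code like this.

```python
from typing import Callable, Optional, Sequence


def pick_active_master(
    masters: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy master in priority order, or None.

    Because the order is fixed, every slave that asks at the same moment
    gets the same answer, so they all replicate from a single master.
    """
    for url in masters:
        if is_healthy(url):
            return url
    return None
```

With `masters = ["http://master1:8983", "http://master2:8983"]` (made-up hosts), the secondary is only ever returned while the primary fails its health check.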

On a side note, it would be interesting to hear some of your experiences indexing such a huge data set.
