Solr healthcheck for >0 documents

Solr's default /admin/ping handler, which is provided for load balancer health checks, integrates well with Amazon ELB health checks.

However, we're using master-slave replication. When we provision a new node, Solr starts up and replication begins, but in the meantime /admin/ping returns success before the index has replicated across from the master and the node has any documents.

We'd like nodes to be brought live only once they have completed the first replication and have documents. I don't see any way of doing this with the /admin/ping PingRequestHandler: it always returns success if the search succeeds, even with zero results.

Nor is there any way of matching (or not matching) expected text in the response with the ELB health check configuration.

How can we achieve this?


To expand on the nature of the problem here, the PingRequestHandler will always return success unless:

  1. Its query results in an exception being thrown.
  2. It is configured to use a healthcheck file, and that file is not found.

Thus my suggestion is that you configure the PingRequestHandler to use a healthcheck file. You can then run a cron job on your Solr system that checks for the existence of documents and creates (or removes) the healthcheck file accordingly. If the healthcheck file is not present, the PingRequestHandler will return an HTTP 503, which should be sufficient for ELB.

The rough algorithm that I'd use...

  • Every minute, query http://localhost:8983/solr/select?q=*:*
  • If numFound > 0 in the response then touch /path/to/solr-enabled
  • Else rm /path/to/solr-enabled (optional, depending on your strictness)
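
The steps above could be sketched as a small Python script run from cron. This is only a rough sketch under a few assumptions: the extra rows=0 and wt=json parameters are added so the response is small and easy to parse, the select response reports its match count as numFound, and the function names are made up for illustration. Any error reaching Solr is treated as "zero documents".

```python
#!/usr/bin/env python3
"""Keep a Solr healthcheck file in sync with the document count (sketch)."""
import json
import os
import urllib.request

# Assumed values; adjust both to the local install.
SOLR_QUERY_URL = "http://localhost:8983/solr/select?q=*:*&rows=0&wt=json"
HEALTHCHECK_FILE = "/path/to/solr-enabled"


def count_documents(response_body):
    """Return numFound from a Solr JSON select response body."""
    return int(json.loads(response_body)["response"]["numFound"])


def update_healthcheck(num_docs, path, strict=True):
    """Touch the file when documents exist; remove it otherwise if strict.

    Returns True if the healthcheck file exists afterwards.
    """
    if num_docs > 0:
        # Equivalent of `touch`: create if missing, leave content alone.
        with open(path, "a"):
            pass
    elif strict and os.path.exists(path):
        os.remove(path)
    return os.path.exists(path)


def main():
    try:
        with urllib.request.urlopen(SOLR_QUERY_URL, timeout=10) as resp:
            num_docs = count_documents(resp.read().decode("utf-8"))
    except (OSError, ValueError, KeyError):
        # Solr unreachable or response unparseable: treat as no documents.
        num_docs = 0
    update_healthcheck(num_docs, HEALTHCHECK_FILE)


if __name__ == "__main__":
    main()
```

A crontab entry along the lines of `* * * * * /usr/bin/python3 /path/to/solr_healthcheck.py` (the script path is hypothetical) would run it every minute. Passing strict=False corresponds to skipping the optional rm step.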

The healthcheck file can be configured in the <admin> block, and you can use an absolute path, or a filename relative to the directory from which you have started Solr.

<admin>
  <defaultQuery>solr</defaultQuery>
  <pingQuery>q=*:*</pingQuery>
  <healthcheck type="file">/path/to/solr-enabled</healthcheck>
</admin>

Let me know how that works out! I'm tempted to implement something similar for read slaves at Websolr.


I ran into an interesting solution here: https://jobs.zalando.com/tech/blog/zookeeper-less-solr-architecture-aws/?gh_src=4n3gxh1

It's basically a servlet that you add to the Solr webapp; it checks all of the cores to make sure they have documents.
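
This is not the servlet from the linked post, but the same "every core has documents" idea could be sketched outside Solr in Python. It assumes the JSON body comes from the CoreAdmin STATUS call (for example /solr/admin/cores?action=STATUS&wt=json) and that each core entry carries an index.numDocs value; the function names are hypothetical.

```python
import json


def cores_without_documents(status_body):
    """Return sorted names of cores whose index reports zero documents."""
    status = json.loads(status_body).get("status", {})
    empty = []
    for name, info in status.items():
        # A core with no index section is treated as empty.
        if info.get("index", {}).get("numDocs", 0) == 0:
            empty.append(name)
    return sorted(empty)


def all_cores_ready(status_body):
    """True only when at least one core exists and none of them is empty."""
    status = json.loads(status_body).get("status", {})
    return bool(status) and not cores_without_documents(status_body)
```

The result of all_cores_ready could then drive the healthcheck file, or the HTTP status of a small endpoint that ELB polls.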

I'm toying with a more sophisticated solution but haven't tested it/made much progress yet: https://gist.github.com/er1c/e261939629d2a279a6d74231ce2969cf

What I like about this approach (in theory) is the ability to check the replication status/success for multiple cores. If anyone finds an actual implementation of this approach please let me know!
