Solr only vs. Solr/MySQL solution

Currently I have a system that is based solely on Solr: I store all data in Solr (using SolrJ), with no other datastore involved. The problem is that I am now experiencing some performance issues. I thought it might make sense to store the data in MySQL and then synchronize it with Solr, e.g. with the DataImportHandler. That way the read operations would hit the Solr index and the main write operations would go to MySQL, with Solr writes happening only occasionally, during synchronization.

The thing is that I expect hundreds of millions of documents to be stored, and I don't really know whether the MySQL/Solr combination makes sense.

Is there a better solution? Maybe a Solr master for writing and Solr slaves for reading?

Update: What I forgot to say is that the "storing data in MySQL" solution could also be useful in case of a schema.xml change, in my opinion, because then I can re-index all the data from MySQL without caring about the data stored in Solr itself.


It's not advisable to use the same Solr instance for both reading and writing, because the write-side activity on Solr (commits and optimizes) would heavily impact read operations.

A master-slave configuration would be a nicer approach, with the master used primarily for writes and the slaves for read-only purposes.
The slaves are periodically refreshed with the contents of the master, so there will be some delay before changes become visible.
You can always scale reads by adding more slaves.
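On the application side, the read/write split could look roughly like the sketch below. It is only an illustration: the `SolrRouter` class and the host names are made up, and it merely picks URLs (updates to the master, queries to the slaves in round-robin order) without sending any requests.

```python
from itertools import cycle


class SolrRouter:
    """Send updates to the master and spread queries over the slaves."""

    def __init__(self, master_url, slave_urls):
        if not slave_urls:
            raise ValueError("at least one slave URL is required")
        self.master_url = master_url.rstrip("/")
        # cycle() yields the slaves in order, forever (round-robin).
        self._slaves = cycle([url.rstrip("/") for url in slave_urls])

    def update_url(self):
        # All writes go to the master only.
        return f"{self.master_url}/update"

    def select_url(self):
        # Each read goes to the next slave in turn.
        return f"{next(self._slaves)}/select"


# Hypothetical hosts, for illustration only.
router = SolrRouter(
    "http://solr-master:8983/solr/",
    ["http://solr-slave1:8983/solr", "http://solr-slave2:8983/solr"],
)
```

In practice a load balancer in front of the slaves would do the same job; the point is only that the application never queries the master and never writes to a slave.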

Using MySQL as a persistent store together with master-slave Solr would be the best approach.
MySQL provides a stable data store and guards you against index corruption or other issues that would result in data loss.
With the DataImportHandler you can do this easily using incremental (delta) updates, but there will be a longer time lag before the latest data appears on the slaves.
With this setup you can also use index swapping for full refreshes.
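As a rough sketch, the two HTTP requests involved could be built like this. It assumes the DataImportHandler is registered under `/dataimport` in solrconfig.xml (the path is configurable) and that the full refresh is built in a second core; the host and core names are made up, and the helper functions are my own, not part of any Solr client.

```python
from urllib.parse import urlencode


def dataimport_url(core_url, command, **params):
    """Build a DataImportHandler request, e.g. command='delta-import'."""
    query = {"command": command}
    query.update(params)
    return f"{core_url.rstrip('/')}/dataimport?{urlencode(query)}"


def swap_cores_url(solr_url, live_core, rebuilt_core):
    """Build a CoreAdmin SWAP request that exchanges two cores."""
    query = {"action": "SWAP", "core": live_core, "other": rebuilt_core}
    return f"{solr_url.rstrip('/')}/admin/cores?{urlencode(query)}"


# Incremental update on the master (hypothetical host and core name).
delta = dataimport_url(
    "http://solr-master:8983/solr/docs", "delta-import", commit="true"
)

# Full refresh: rebuild into a spare core, then swap it with the live one.
full = dataimport_url(
    "http://solr-master:8983/solr/docs_rebuild", "full-import", clean="true"
)
swap = swap_cores_url("http://solr-master:8983/solr", "docs", "docs_rebuild")
```

The delta import only works if data-config.xml defines a delta query for the entity, typically based on a last-modified column in MySQL.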

If the index grows too large to be maintainable and performance suffers, you may want to look at Solr shards.
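With manual sharding, the application decides which shard indexes each document and lists all shards in the `shards` parameter when querying. A minimal sketch, assuming three made-up shard addresses and a simple hash of the document id (both are my assumptions, not something Solr prescribes):

```python
import zlib
from urllib.parse import urlencode

# Hypothetical shard addresses, in the host:port/path form the shards parameter expects.
SHARDS = ["solr1:8983/solr/docs", "solr2:8983/solr/docs", "solr3:8983/solr/docs"]


def shard_for(doc_id, shards=SHARDS):
    """Pick the shard that indexes a document, using a stable hash of its id."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def distributed_query(q, shards=SHARDS):
    """Build the query string for a search that fans out over all shards."""
    return urlencode({"q": q, "shards": ",".join(shards)}, safe=":/,")
```

Because `crc32` is deterministic, the same id always lands on the same shard, so an updated document overwrites its old version instead of being duplicated on another shard. Note that changing the number of shards changes the mapping and requires re-indexing, which is one more argument for keeping the source data in MySQL.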


I also thought about the same issue: store everything in Solr, or store in MySQL and index in Solr.

I decided to go the second way: store in MySQL and index in Solr.

The reason: handling data (reading and writing) is much better in MySQL than in Solr. Also, data import/export to and from MySQL is supported out of the box by lots of tools. Next point: backup. There are many more established ways of backing up a MySQL database than a Solr index.

Of course, for full-text search Solr is much better than MySQL. So I decided that each system should do the work it does best. For your information: I'm talking about a medium-sized index, 4 GB for a few million documents.
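To illustrate the split, here is a rough sketch of how rows read from MySQL could be turned into Solr update payloads in batches. The field names and helper functions are made up; the rows are plain dicts standing in for whatever your MySQL driver returns, and the JSON list of documents is the format accepted by Solr's JSON update handler.

```python
import json


def batches(rows, size):
    """Yield the rows in lists of at most `size` items."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def to_update_payload(rows, indexed_fields):
    """Keep only the fields Solr should index and serialize them as JSON."""
    docs = [
        {f: row[f] for f in indexed_fields if row.get(f) is not None}
        for row in rows
    ]
    return json.dumps(docs)


# Hypothetical rows; 'body_html' stays in MySQL and is not sent to Solr.
rows = [
    {"id": 1, "title": "Solr basics", "body_html": "<p>...</p>"},
    {"id": 2, "title": "MySQL backup", "body_html": "<p>...</p>"},
    {"id": 3, "title": None, "body_html": "<p>...</p>"},
]
payloads = [to_update_payload(b, ["id", "title"]) for b in batches(rows, 2)]
```

Each payload would then be POSTed to the update handler, and the same loop can be rerun over the whole table whenever the schema changes.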

Edit: don't forget that some features, like highlighting, require the data to be stored in Lucene (not only indexed). If you need them, you have to store the documents in Solr as well. An alternative is to implement those features on the client side. (I did it this way.)
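A minimal sketch of what client-side highlighting could look like, assuming the full text is fetched from MySQL by id after Solr returns the hits. The `highlight` function is illustrative only: it matches whole words case-insensitively and knows nothing about Solr's analyzers (stemming, synonyms), so the result can differ from what Solr's own highlighter would mark.

```python
import html
import re


def highlight(text, terms, tag="em"):
    """Wrap whole-word, case-insensitive matches of the terms in an HTML tag."""
    escaped_terms = [re.escape(t) for t in terms if t]
    if not escaped_terms:
        return html.escape(text)
    pattern = re.compile(r"\b(" + "|".join(escaped_terms) + r")\b", re.IGNORECASE)
    # With one capturing group, split() alternates: non-match, match, non-match, ...
    parts = pattern.split(text)
    out = []
    for i, part in enumerate(parts):
        piece = html.escape(part)  # escape before adding our own markup
        out.append(f"<{tag}>{piece}</{tag}>" if i % 2 else piece)
    return "".join(out)


snippet = highlight("Solr is fast & Solr scales", ["solr"])
# snippet == "<em>Solr</em> is fast &amp; <em>Solr</em> scales"
```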
