Solr vs. MySQL performance for autocomplete
In one of our applications, we need to hold some plain tabular data and we need to be able to perform user-side autocompletion on one of the columns.
The initial solution we came up with, was to couple MySQL with Solr to achieve this (MySQL to hold data and Solr to just hold the tokenized column and return ids as result). But something unpleasant happened recently (developers started开发者_开发知识库 storing some of the data in Solr, because the MySQL table and the operations done on it are nothing that Solr can not provide) and we thought maybe we could merge them together and eliminate one of the two.
So we had to either: (1) move all the data to Solr (2) use MySQL for autocompletion
(1) sounded terrible so I gave it a shot with (2), I started with loading that single column's data into MySQL, disabled all caches on both MySQL and Solr, wrote a tiny webapp that is able to perform very similar queries [1] on both databases, and fired up a few JMeter scenarios against both in a local and similar environment. The results show a 2.5-3.5x advantage for Solr, however, I think the results may be totally wrong and fault prone.
So, what would you suggest for:
- Correctly benchmarking these two systems, I believe you need to provide similar[to MySQL] environment to JVM.
- Designing this system.
Thanks for any leads.
[1] SELECT column FROM table WHERE column LIKE 'USER-INPUT%'
on MySQL and column:"USER-INPUT"
on Solr.
I recently moved a website over from getting its data from the database (postgres) to getting all data from Solr. Unbelievable difference in speed. We also have autocomplete for Australian suburbs (about 15K of them) and it finds them in a couple of milliseconds, so the ajax auto-complete (we used jQuery) reacts almost instantly.
All updates are done against the original database, but our site is a mostly-read site. We used triggers to fire events when records were updated and that spawns a reindex into Solr of the record.
The other big speed improvement was pre-caching data required to render the items - ie we denormalize data and pre-calculate lots of stuff at Solr indexing time so the rendering is easy for the web guys and super fast.
Another advantage is that we can put our site into read-only mode if the database needs to be taken offline for some reason - we just fall back to Solr. At least the site doesn't go fully down.
I would recommend using Solr as much as possible, for both speed and scalability.
精彩评论