Search implementation dilemma: full text vs. plain SQL
I have a MySQL/Rails app that needs search. Here's some info about the data:
Users search within their own data only, so searches are narrowed down by user_id to begin with.
Each user will have up to about five thousand records (they accumulate over time).
I wrote out a typical user's records to a text file. The file size is 2.9 MB.
Search has to cover two columns:
title
andbody
.title
is a varchar(255) column.body
is column type text.This will be lightly used. If I average a few searches per second that would be surprising.
It's running an a 500 MB CentOS 5 VPS machine.
I don't want relevance ranking or any kind of fuzziness. Searches should be for exact strings and reliably return all records containing the string. Simple date order -- newest to oldest.
I'm using the InnoDB table type.
I'm looking at pl开发者_如何转开发ain SQL search (through the searchlogic gem) or full text search using Sphinx and the Thinking Sphinx gem.
Sphinx is very fast and Thinking Sphinx is cool, but it adds complexity, a daemon to maintain, cron jobs to maintain the index.
Can I get away with plain SQL search for a small scale app?
I think plain SQL search won't be the good choice. Because of when we fetching text type columns in MySQL the request is always falling to hard drive no matter what cache settings are.
You can use plain SQL search only with very small apps.
I'd prefer Sphinx for that.
I would start out simple -- chances are that plain SQL will work well, and you can always switch to full text search later if the search function proves to be a bottleneck.
I'm developing and maintaining an application with a search function with properties similar to yours, and plain SQL search has worked very well for me so far. I had similar performance concerns when I first implemented the search function a year or two ago, but I haven't seen any performance problems whatsoever yet.
Having used MySQL fulltext search for about 4 years, and just moving now to Sphinx, I'd say that a regular MySQL search using the fulltext boolean (ie exact) syntax will be fine. It's fast and it will do exactly what you want. The amount of data you will be searching at any one time will be small.
The only problem might be ordering the results. MySQL's fulltext search can get slow when you start ordering things by (eg) date, as that requires that you search the entire table, rather than just the first nn results it finds. That was ultimately the reason I moved to Sphinx.
Sphinx is also awesome, so don't be afraid to try it, but it sounds like the additional functionality may not be required in your case.
精彩评论