MySQL: Optimizing Searches with LIKE or FULLTEXT
I am building a forum and I am looking for the proper way to build a search feature that finds users by their name or by the title of their posts. What I have come up with is this:
SELECT users.id, users.user_name, users.user_picture
FROM users, subject1, subject2
WHERE users.id = subject1.user_id
AND users.id = subject2.user_id
AND (users.user_name LIKE '%{$keywords}%'
OR subject1.title1 LIKE '%{$keywords}%'
OR subject2.title2 LIKE '%{$keywords}%')
ORDER BY users.user_name ASC
LIMIT 10
OFFSET {$offset}
The LIMIT and the OFFSET is for pagination. My question is, would doing a LIKE search through multiple tables greatly slow down performance when the number of rows reach a significant amount?
I have a few alternatives: One, perhaps I can rewrite that query to have the LIKE searches done inside a subquery that only returns indexed user_ids. Then, I would find the remaining user information based on that. Would that increase performance by much?
Second, I suppose I can have the $keyword
string appear before the first wildcard as in LIKE {$keyword}%
. This way, I can index the user_name, title1, and title2
columns. However, since I will be trading accuracy for speed here, how much of a difference in performance would this make? Will it be worth sacrificing this much accuracy to index these columns?
Third, perhaps I can give users 3 search fields to choose from, and have each search through only one table. Would this increase performance by much?
Lastly, should I consider using a FULLTEXT search instead of LIKE? What are the performance differences between the two? Also, my tables are usi开发者_运维技巧ng the InnoDB storage engine, and I am not able to use the FULLTEXT index unless I switch to MyISAM. Will there be any major differences in switching to MyISAM?
Pagination is another performance issue I am worried about, because in order to do pagination, I would need to find the total number of results the query returns. At the moment, I am basically doing the query I just mentioned TWICE because the first time it is used only to COUNT
the results.
There are two things in your query that will prevent MySql from using indexes firstly your patterns start with a wildcard %
, MySql can't use indexes to search for patterns that start with a wildcard, secondly you have OR
in your WHERE
clause you need to rewrite your query using UNION
to avoid using OR which also prevents MySql from using indexes. Without using an index MySql needs to do a full table scan every time and the time needed for that will increase linearly as the number of rows grow in your table and yes as you put it "it would greatly slow down performance when the number of rows reach a significant amount" so I'd say your only real scalable option is to use FULLTEXT search.
Most of your questions are explained here: http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning
InnoDB/fulltext indexing is announced for MySQL 5.6, but that will probably not help you right now.
How about starting with EXPLAIN <select-statement>
? http://dev.mysql.com/doc/refman/5.6/en/explain.html
Switching to MyISAM should work seemlessly. The only downside is, that MyISAM is locking the whole table upon inserts/updates, which can be slow down tables with many more inserts than selects. Basically a rule of thumb in my opinion is to use MyISAM when you don't need foreign keys and the table has far more selects than inserts and use InnoDB when the table has far more inserts/updates than selects (e.g. for a statistic table).
In your case I guess switching to MyISAM is the better choice as a fulltext index is way more powerful and faster.
It also delivers the possibilty to use certain query modifiers like excluding words ("cat -dog
") or similar. But keep in mind that it's not possible to look for words ending with a phrase anymore like with a LIKE-search ("*bar
"). "foo*
" will work though.
精彩评论