Improving MySQL tables with Indexes
I am very new to Indexes in MySQL. I know, I should probably have leart it earlier, but most projects been small enough for me to get away with out it ;)
So, now I am testing it. I did my test by running EXPLAIN
on a query:
Query:
EXPLAIN SELECT a . *
FROM `tff__keywords2data` AS a
LEFT JOIN `tff__keywords` AS b ON a.keyword_id = b.id
WHERE (
b.keyword = 'dog' || b.keyword = 'black' || b.keyword = 'and' || b.keyword = 'white'
)
GROUP BY a.data_id
HAVING COUNT( a.data_id ) =4
First, without indexes 开发者_JAVA技巧I got these results:
Then, with index on data_id and keyword_id i got this:
So as I understand, the number of rows MySQL has to search goes from 61k down to 10k which must be good right?
So my question is, am I correct here? And is there anything else I could think about when trying to optimize?
UPDATE:
Further more, after some help from AJ and Piskvor pointing out my other table and its column keyword not having index I got this:
Great improvement! Right?
As you see, the key
used for table b
is still NULL
. You may want to add an index on b.keyword
and match with
WHERE b.keyword IN ('dog','black','and','white')
This is functionally different from your WHERE
clause, although it returns the same results.
As it looks, you may be interested in fulltext searching.
Depending on what you want to achieve, you should either replace the LEFT JOIN
with the INNER JOIN
or move your WHERE
condition into the ON
clause:
As it is now:
SELECT a.*
FROM `tff__keywords2data` AS a
LEFT JOIN
`tff__keywords` AS b
ON b.id = a.keyword_id
WHERE b.keyword = 'dog' || b.keyword = 'black' || b.keyword = 'and' || b.keyword = 'white'
GROUP BY
a.data_id
HAVING COUNT( a.data_id ) = 4
your query is in fact an INNER
join (since you have non-null conditions in the WHERE
clause).
Also, instead of using bit arithmetics (which is not sargable) you should use native OR
or IN
constructs:
SELECT a.*
FROM `tff__keywords2data` AS a
JOIN `tff__keywords` AS b
ON b.id = a.keyword_id
WHERE b.keyword IN ('dog', 'black', 'and', 'white')
GROUP BY
a.data_id
HAVING COUNT(*) = 4
You may also want to create an index on ttf__keywords (keyword)
which can filter on the keywords
you are searching for and make less records to be selected from the leading b
.
Finally, if you don't need implicit ordering on a.data_id
, get rid of it by appending ORDER BY NULL
:
SELECT a.*
FROM `tff__keywords2data` AS a
JOIN `tff__keywords` AS b
ON b.id = a.keyword_id
WHERE b.keyword IN ('dog', 'black', 'and', 'white')
GROUP BY
a.data_id
HAVING COUNT(*) = 4
ORDER BY
NULL
This will remove filesort
from your plan.
Yep thats improved (but from quickly looking i think can be more improved). what you can see is that the query optimiser is now seeing AND USING keyword_id index. it has reduced the rows its searching from 64283 down to 10216. but this is still using a filesort which hopefully someone else can clarify is similar to a SQL Server table scan? which isn't good... i could be wrong there though.
You should be able to now reduce the rows from table b down below 10216
You're doing a string comparison to b.keyword....add an index there.
Use an INNER JOIN
instead of a LEFT JOIN
. A left join will return unmatched rows in the join table which I don't think you need here.
Try putting indexes on everything in a WHERE clause, and anything in a JOIN, so that would be:
a.keyword_id b.id b.keyword
You may also want to try adding an index to a.data_id, as it's in a "GROUP BY". Too many indexes is usually not a problem, unless you're adding large volumes of data to large tables - that can cause INSERTs to be very slow.
精彩评论