Index not used where it can and should
I have a (big) table where I do a query on, using 3 fields in the WHERE
. One of these fields has an index (a date), and I'm looking for hits in the past 3 months. While it'll never be a quick query, the least I hope for is the index on this date to be used.
This is my query:
SELECT id
FROM statsTable
WHERE 1
AND ip = 'ipgoeshere'
AND anotherstring = 'Quite a long string goes here, something like this or even longer'
AND `date` > DATE_ADD( NOW( ) , INTERVAL -3 MONTH )
And it's explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE statsTable ALL date NULL NULL NULL 4833721 Using where; Using filesort
This is a complete table-scan, the number of rows is off because of INNODB-row-counting I guess, but that's all of em. This takes about 30 seconds.
If I force the index like so, I get the expected result:
SELECT id
FROM statsTable FORCE INDEX (date)
WHERE 1
AND ip = 'ipgoeshere'
AND anotherstring = 'Quite a long string goes here, something like this or even longer'
AND `date` > DATE_ADD( NOW( ) , INTERVAL -3 MONTH )
Again, the explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE statsTable range date date 8 NULL 1120172 Using where
Now we have 'only' a million results, but this gets done "lighting" quick (as in, 3 seconds instead of 30).
The table:
CREATE TABLE IF NOT EXISTS `statsTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`ip` varchar(15) NOT NULL,
`anotherstring` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
KEY `date` (`date`)
) ENGINE=InnoDB;
The strange thing is: I have this table running on another database too (running on a different server), and the index IS being used in that instance. I can't see what could be the issue here. Is there a setting I misse开发者_StackOverflow中文版d? Or could it be some other minor difference? Apart from the differences, I can't see why above query wouldn't use the key.
I have run OPTIMIZE TABLE
and, as @DhruvPathak suggested ANALYZE TABLE
, but the explain still stays the same. I also tried an ALTER TABLE
as suggested by a friend, to rebuild the index. No luck.
The index is not used because the execution planner decides that it's best to full scan the table than use the index. This happens when the index is not selective enough for a query.
If the dates in your range check are more than 10-20% of the whole table, then the planner decides that scanning (sequentially) the whole table will be faster than using the index and retrieving the rows that fall in that range (this retrieval will not be sequential as the rows will be scattered all over the table).
That's why you see different behaviour with different sets of data.
For your query to work best, you can create index on:
(ip, yourDateField)
or
(anotherstring, yourDateField)
or
(ip, anotherstring, yourDateField)
I think the first option will be selective enough. No need to add the long VARCHAR(255)
field in an index. Alternatively, use the FORCE INDEX
that seems to work fine in your case.
Run ANALYZE TABLE once, and see if that helps in correcting the choice of the optimizer.
http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html
This will also help : MySQL not using indexes with WHERE IN clause?
Can you try editing your query ?
Why is there a reduntant TRUE condition WHERE 1 in the query ?
Change
SELECT id
FROM statsTable
WHERE 1
AND ip = 'ipgoeshere'
AND anotherstring = 'Quite a long string goes here, something like this or even longer'
AND `date` > DATE_ADD( NOW( ) , INTERVAL -3 MONTH )
To
SELECT id
FROM statsTable
where `date` > DATE_ADD( NOW( ) , INTERVAL -3 MONTH )
AND ip = 'ipgoeshere'
AND anotherstring = 'Quite a long string goes here, something like this or even longer'
based on your query format, the ideal index should be on
ip, date
or
ip, date, anotherstring <-- this could be overkill
and
order by null <-- eliminate the file sort
lastly, it could be your another database contains far lesser record
精彩评论