开发者

Index not used where it can and should

I have a (big) table where I do a query on, using 3 fields in the WHERE. One of these fields has an index (a date), and I'm looking for hits in the past 3 months. While it'll never be a quick query, the least I hope for is the index on this date to be used.

This is my query:

SELECT id
FROM statsTable
WHERE 1
   AND ip            =  'ipgoeshere'
   AND anotherstring =  'Quite a long string goes here, something like this or even longer'
   AND `date`        >  DATE_ADD( NOW( ) , INTERVAL -3 MONTH ) 

And it's explain:

id  select_type table       type    possible_keys   key     key_len ref     rows    Extra
1   SIMPLE      statsTable  ALL     date            NULL    NULL    NULL    4833721 Using where; Using filesort   

This is a complete table-scan, the number of rows is off because of INNODB-row-counting I guess, but that's all of em. This takes about 30 seconds.

If I force the index like so, I get the expected result:

SELECT id
FROM statsTable FORCE INDEX (date)
WHERE 1
   AND ip            =  'ipgoeshere'
   AND anotherstring =  'Quite a long string goes here, something like this or even longer'
   AND `date`        >  DATE_ADD( NOW( ) , INTERVAL -3 MONTH ) 

Again, the explain:

id  select_type table       type    possible_keys   key     key_len ref     rows    Extra
1   SIMPLE      statsTable  range   date            date    8       NULL    1120172 Using where

Now we have 'only' a million results, but this gets done "lighting" quick (as in, 3 seconds instead of 30).

The table:

CREATE TABLE IF NOT EXISTS `statsTable` (

  `id`            int(11) unsigned NOT NULL AUTO_INCREMENT,
  `date`          datetime NOT NULL,
  `ip`            varchar(15) NOT NULL,
  `anotherstring` varchar(255) NOT NULL,

  PRIMARY KEY (`id`),
  KEY `date` (`date`)

) ENGINE=InnoDB;

The strange thing is: I have this table running on another database too (running on a different server), and the index IS being used in that instance. I can't see what could be the issue here. Is there a setting I misse开发者_StackOverflow中文版d? Or could it be some other minor difference? Apart from the differences, I can't see why above query wouldn't use the key.

I have run OPTIMIZE TABLE and, as @DhruvPathak suggested ANALYZE TABLE, but the explain still stays the same. I also tried an ALTER TABLE as suggested by a friend, to rebuild the index. No luck.


The index is not used because the execution planner decides that it's best to full scan the table than use the index. This happens when the index is not selective enough for a query.

If the dates in your range check are more than 10-20% of the whole table, then the planner decides that scanning (sequentially) the whole table will be faster than using the index and retrieving the rows that fall in that range (this retrieval will not be sequential as the rows will be scattered all over the table).

That's why you see different behaviour with different sets of data.


For your query to work best, you can create index on:

(ip, yourDateField)

or

(anotherstring, yourDateField)

or

(ip, anotherstring, yourDateField)

I think the first option will be selective enough. No need to add the long VARCHAR(255) field in an index. Alternatively, use the FORCE INDEX that seems to work fine in your case.


Run ANALYZE TABLE once, and see if that helps in correcting the choice of the optimizer.

http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html

This will also help : MySQL not using indexes with WHERE IN clause?

Can you try editing your query ?

Why is there a reduntant TRUE condition WHERE 1 in the query ?

Change

SELECT id
FROM statsTable
WHERE 1
   AND ip            =  'ipgoeshere'
   AND anotherstring =  'Quite a long string goes here, something like this or even longer'
   AND `date`        >  DATE_ADD( NOW( ) , INTERVAL -3 MONTH ) 

To

SELECT id
FROM statsTable
where  `date`        >  DATE_ADD( NOW( ) , INTERVAL -3 MONTH ) 
AND ip            =  'ipgoeshere'
AND anotherstring =  'Quite a long string goes here, something like this or even longer'


based on your query format, the ideal index should be on

ip, date

or

ip, date, anotherstring <-- this could be overkill

and

order by null <-- eliminate the file sort

lastly, it could be your another database contains far lesser record

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜