Why an index can make a query really slow?

2023-02-26 21:21 问答作者：

Some day I answered a question on SO (accepted as correct), but the answer left me with a great doubt.

Shortly, user had a table with this fields:

id INT PRIMARY KEY
dt DATETIME (with an INDEX)
lt DOUBLE

The query SELECT DATE(dt),AVG(lt) FROM table GROUP BY DATE(dt) was really slow. We told him that (part of) the problem was using DATE(dt) as field and grouping, but db was on a production server and wasn't possible to split that field.

So (with a trigger) was inserted another field da DATE (with an INDEX) filled automatically with DATE(dt). Query SELECT da,AVG(lt) FROM table GROUP BY da was a bit fa开发者_高级运维ster, but with about 8mln records it took about 60s!!!

I tried on my pc and finally I discovered that, removing the index on field da query took only 7s, while using DATE(dt) after removing index it took 13s.

I've always thought an index on column used for grouping could really speed the query up, not the contrary (8 times slower!!!).

Why? Which is the reason?

Thanks a lot.

Because you still need to read all the data from both index + data file. Since you're not using any where condition - you always will have the query plan, that access all the data, row by row and you can do nothing with this.

If performance is important for this query and it is performed often - I'd suggest to cache the results into some temporary table and update it hourly (daily, etc).

Why it becomes slower: because in index data is already sorted and when mysql calculates cost of the query execution it thinks that it will be better to use already sorted data, then group it, then calculate agregates. But it is not in this case.

I think this is because of this or similiar MySQL bug: Index degrades sort performance and optimizer does not honor IGNORE INDEX

I remember the question as I was going to answer it but got distracted with something else. The problem was that his table design wasnt taking advantage of a clustered primary key index.

I would have re-designed the table creating a composite clustered primary key with the date as the leading part of the index. The sm_id field is still just a sequential unsigned int to guarantee uniqueness.

drop table if exists speed_monitor;
create table speed_monitor 
(
created_date date not null,
sm_id int unsigned not null,
load_time_secs double(10,4) not null default 0,
primary key (created_date, sm_id)
)
engine=innodb;

+------+----------+
| year | count(*) |
+------+----------+
| 2009 | 22723200 | 22 million
| 2010 | 31536000 | 31 million
| 2011 |  5740800 |  5 million
+------+----------+

select 
 created_date, 
 count(*) as counter,
 avg(load_time_secs) as avg_load_time_secs
from
 speed_monitor
where
 created_date between '2010-01-01' and '2010-12-31'
group by
 created_date
order by
 created_date
limit 7;

-- cold runtime

+--------------+---------+--------------------+
| created_date | counter | avg_load_time_secs |
+--------------+---------+--------------------+
| 2010-01-01   |   86400 |         1.66546802 |
| 2010-01-02   |   86400 |         1.66662466 |
| 2010-01-03   |   86400 |         1.66081309 |
| 2010-01-04   |   86400 |         1.66582251 |
| 2010-01-05   |   86400 |         1.66522316 |
| 2010-01-06   |   86400 |         1.66859480 |
| 2010-01-07   |   86400 |         1.67320440 |
+--------------+---------+--------------------+
7 rows in set (0.23 sec)

Why an index can make a query really slow?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？