Query Takes too long to Complete
I have a table with 900k+ records
it takes a minute or more to run this query:
SELECT
t.user_id,
SUM(t.direction = "i") AS 'num_in',
SUM(t.direction = "o") AS 'num_out'
FROM tbl_user_reports t
WHERE t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
GROUP BY t.user_id
HAVING t.user_id IS NOT NULL
ORDER BY num_in DESC
LIMIT 10;
can you tell me how to query the result faster?
-- more info -- structure:
id int(11) unsigned NOT NULL
subscriber varchar(255) NULL
user_id int(11) unsigned NULL
carrier_id int(11) unsigned NOT NULL
pool_id int(11) unsigned NOT NULL
service_id int(11) unsigned NOT NULL
persona_id int(11) unsigned NULL
inbound_id int(11) unsigned NULL
outbound_id int(11开发者_StackOverflow中文版) unsigned NULL
bound_time datetime NOT NULL
direction varchar(1) NOT NULL
indexes:
bound_timebound_time
FK_tbl_user_reportspersona_id
FK_tbl_user_reports_messageinbound_id
FK_tbl_user_reports_serviceservice_id
FK_tbl_user_reports_poolpool_id
FK_tbl_user_reports_useruser_id
FK_tbl_user_reports_carriercarrier_id
FK_tbl_user_reports_subscribersubscriber
FK_tbl_user_reports_outboundoutbound_id
directiondirection
You may want to try a compound index on
(bound_time, user_id, direction)
Contains all the fields you need and can be narrowed by the date range very efficiently.
If possible redesign your report table to take more advantage of your innodb clustered primary key index.
Here's a simplified example of what i mean:
5 million rows 32K users 126K records in date range
cold runtime (after mysqld restart) = 0.13 seconds
create table user_reports
(
bound_time datetime not null,
user_id int unsigned not null,
id int unsigned not null,
direction tinyint unsigned not null default 0,
primary key (bound_time, user_id, id) -- clustered composite PK
)
engine=innodb;
select count(*) as counter from user_reports;
+---------+
| counter |
+---------+
| 5000000 |
+---------+
select count(distinct(user_id)) as counter from user_reports;
+---------+
| counter |
+---------+
| 32000 |
+---------+
select count(*) as counter from user_reports
where bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00';
+---------+
| counter |
+---------+
| 126721 |
+---------+
select
t.user_id,
sum(t.direction = 1) AS num_in,
sum(t.direction = 0) AS num_out
from
user_reports t
where
t.bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00' and
t.user_id is not null
group by
t.user_id
order by
direction desc
limit 10;
+---------+--------+---------+
| user_id | num_in | num_out |
+---------+--------+---------+
| 17397 | 1 | 1 |
| 14729 | 2 | 1 |
| 20094 | 4 | 1 |
| 19343 | 7 | 1 |
| 24804 | 1 | 2 |
| 14714 | 3 | 2 |
| 2662 | 4 | 3 |
| 16360 | 2 | 3 |
| 21288 | 2 | 3 |
| 12800 | 6 | 2 |
+---------+--------+---------+
10 rows in set (0.13 sec)
explain
select
t.user_id,
sum(t.direction = 1) AS num_in,
sum(t.direction = 0) AS num_out
from
user_reports t
where
t.bound_time between '2011-02-01 00:00:00' and '2011-04-30 00:00:00' and
t.user_id is not null
group by
t.user_id
order by
direction desc
limit 10;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref |rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 8 | NULL |255270 | Using where; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+----------------------------------------------+
1 row in set (0.00 sec)
hope you find this helpful :)
As Thilo said add indexes, also instead tbl_user_reports t
use tbl_user_reports AS t
, I would move HAVING statement to WHERE to reduce amount of operations.
WHERE t.user_id IS NOT NULL AND t.bound_time BETWEEN '2011-02-01' AND '2011-02-28'
UPDATE For experiment purpose you can try to use like instead of between
t.bound_time LIKE '2011-02%'
精彩评论