开发者

MySQL: How do I speed up a "Count()" query with a "JOIN" and "order_by"?

I have the following two (simplified for the sake of example) tables in my MySQL db:

DESCRIBE appname_item;

-----------------+---------------+------+-----+---------+----------------+
| Field           | Type          | Null | Key | Default | Extra          |
+-----------------+---------------+------+-----+---------+----------------+
| id              | int(1开发者_运维问答1)       | NO   | PRI | NULL    | auto_increment |
| name            | varchar(200)  | NO   |     | NULL    |                |
+-----------------+---------------+------+-----+---------+----------------+

DESCRIBE appname_favorite;

+---------------+----------+------+-----+---------+----------------+
| Field         | Type     | Null | Key | Default | Extra          |
+---------------+----------+------+-----+---------+----------------+
| id            | int(11)  | NO   | PRI | NULL    | auto_increment |
| user_id       | int(11)  | NO   | MUL | NULL    |                |
| item_id       | int(11)  | NO   | MUL | NULL    |                |
+---------------+----------+------+-----+---------+----------------+

I'm trying to get a list of items ordered by the number of favorites. The query below works, however there are thousands of records in the Item table, and the query is taking up to a couple of minutes to complete.

SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(`appname_favorite`.`id`) AS `num_favorites` 
FROM `appname_item` 
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`) 
GROUP BY `appname_item`.`id`, `appname_item`.`name` 
ORDER BY `num_favorites` DESC;

Here are the results of EXPLAIN, which provides some insight as to why the query is so slow (type "ALL", "using temporary", and "using filesort" should all be avoided if possible.)

+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table              | type | possible_keys               | key                         | key_len | ref                           | rows | Extra                           |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+
|  1 | SIMPLE      | appname_item       | ALL  | NULL                        | NULL                        | NULL    | NULL                          |  574 | Using temporary; Using filesort |
|  1 | SIMPLE      | appname_favorite   | ref  | appname_favorite_67b70d25   | appname_favorite_67b70d25   | 4       | appname.appname_item.id       |    1 |                                 |
+----+-------------+--------------------+------+-----------------------------+-----------------------------+---------+-------------------------------+------+---------------------------------+

I know that the easiest way to optimize the query is to add an Index, but I can't seem to figure out how to add an Index for a Count() query that involves a JOIN and an order_by. I should also mention that I am running this through the Django ORM, so would prefer to not change the sql query and just work on fixing and fine tuning the db to run the query in the most effective way.

I've been trying to figure this out for a while, so any help would be much appreciated!

UPDATE

Here are the indexes that are already in the db:

+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table              | Non_unique | Key_name                    | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| appname_favorite   |          0 | PRIMARY                     |            1 | id          | A         |         594 |     NULL | NULL   |      | BTREE      |         |
| appname_favorite   |          1 | appname_favorite_fbfc09f1   |            1 | user_id     | A         |          12 |     NULL | NULL   |      | BTREE      |         |
| appname_favorite   |          1 | appname_favorite_67b70d25   |            1 | item_id     | A         |         594 |     NULL | NULL   |      | BTREE      |         |
+--------------------+------------+-----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


Actually you can't avoid filesort because the count is determined at the calculation time and is unknown in the index. The only solution I can imagine is to create a composite index for table appname_item, which may help a little or not, depending on your particular data:

ALTER TABLE appname_item ADD UNIQUE INDEX `item_id_name` (`id` ASC, `name` ASC);


There is nothing wrong with your query - it looks good.

It could be the the optimizer has out-of-date info about the table. Try running this:

ANALYZE TABLE <tableaname>;

for all tables involved.


Firstly, for the count() function, you can check this answer to know more detail: https://stackoverflow.com/a/2710630/1020600

For example, using MySQL, count(*) will be fast under a MyISAM table but slow under an InnoDB. Under InnoDB you should use count(1) or count(pk)

If your storage engines is MYISAM and if you want to count on row (i guess so), use count(*) is enough.

From your EXPLAIN, I found there's no Key for appname_item, if i try to add a condition

where `appname_item`.`id` = `appname_favorite`.`item_id`

then the "key" appears. so funny but it's work.

The final sql like this

explain SELECT `appname_item`.`id`, `appname_item`.`name`, COUNT(*) AS `num_favorites` 
FROM `appname_item` 
LEFT OUTER JOIN `appname_favorite` ON (`appname_item`.`id` = `appname_favorite`.`item_id`) 
where `appname_item`.`id` = `appname_favorite`.`item_id`
GROUP BY `appname_item`.`id`, `appname_item`.`name` 
ORDER BY `num_favorites` DESC;

+----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key
| key_len | ref | rows | Extra
| +----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | appname_favorite | index | item_id | item_id | 5 | NULL | 2312 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | appname_item | eq_ref | PRIMARY | PRIMARY | 4 | test.appname_favorite.item_id | 1 | Using where
| +----+-------------+------------------+--------+---------------+---------+---------+-------------------------------+------+----------------------------------------------+

On my computer, table appname_item has 1686 rows and appname_favorite has 2312 rows, old sql takes from 15 to 23ms. new sql takes 3.7 to 5.3ms

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜