开发者

GROUP BY query optimization

Database is MySQL with MyIS开发者_JAVA技巧AM engine.

Table definition:

CREATE TABLE IF NOT EXISTS  matches  (
   id  int(11) NOT NULL AUTO_INCREMENT,
   game  int(11) NOT NULL,
   user  int(11) NOT NULL,
   opponent  int(11) NOT NULL,
   tournament  int(11) NOT NULL,
   score  int(11) NOT NULL,
   finish  tinyint(4) NOT NULL,
  PRIMARY KEY ( id ),
  KEY  game  ( game ),
  KEY  user  ( user ),
  KEY  i_gfu ( game , finish , user )
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=3149047 ;

I have set an index on (game, finish, user) but this GROUP BY query still needs 0.4 - 0.6 seconds to run:

SELECT user AS player
     , COUNT( id ) AS times
FROM matches
WHERE finish = 1
  AND game = 19
GROUP BY user
ORDER BY times DESC

The EXPLAIN output:

| id | select_type | table   | type | possible_keys | key   | key_len | 
|  1 |  SIMPLE     | matches |  ref | game,i_gfu    | i_gfu |    5    | 

|  ref        |   rows |   Extra                                      |
| const,const | 155855 | Using where; Using temporary; Using filesort |

Is there any way I can make it faster? The table has about 800K records.


EDIT: I changed COUNT(id) into COUNT(*) and the time dropped to 0.08 - 0.12 seconds. I think I've tried that before making the index and forgot to change it again after.

In the explain output the Using index explains the speeding up:

|   rows |   Extra                                                   |
| 168029 | Using where; Using index; Using temporary; Using filesort |

(Side question: is this dropping of a factor of 5 normal?)

There are about 2000 users, so the final sorting, even if it uses filesort, it doesn't hurt performance. I tried without ORDER BY and it still takes almost same time.


Get rid of 'game' key - it's redundant with 'i_gfu'. As 'id' is unique count(id) just returns number of rows in each group, so you can get rid of that and replace it with count(*). Try it that way and paste output of EXPLAIN:

SELECT user AS player, COUNT(*) AS times
FROM matches
WHERE finish = 1
AND game = 19
GROUP BY user
ORDER BY times DESC


Eh, tough. Try reordering your index: put the user column first (so make the index (user, finish, game)) as that increases the chance the GROUP BY can use the index. However, in general GROUP BY can only use indexes if you limit the aggregate functions used to MIN and MAX (see http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html and http://dev.mysql.com/doc/refman/5.5/en/loose-index-scan.html). Your order by isn't really helping either.


One of the shortcomings of this query is that you order by an aggregate. That means that you can't return any rows until the full result set has been generated; no index can exist (for mysql myisam, anyway) to fix that.

You can denormalize your data fairly easily to overcome this, though; You could, for instance, add an insert/update trigger to stick a count value in a summary table, with an index, so that you can start returning rows immediately.


The EXPLAIN verifies the (game, finish, user) index was used in the query. That seems like the best possible index to me. Could it be a hardware issue? What is your system RAM and CPU?


I take it that the bulk of the time is spent on extracting and more importantly sorting (twice, including the one skipped by reading the index) 150k rows out of 800k. I doubt you can optimize it much more than it already is.


As others have noted, you may have reached the limit of your ability to tune the query itself. You should next see what the setting of max_heap_table_size and tmp_table_size variables in your server. The default is 16MB, which may be too small for your table.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜