开发者

MySQL Explain: what's causing 'Using temporary; Using filesort'

I'm planning on creating a view using this SQL SELECT, but the explain for it shows it's using temporary and using filesort. I can't figure out what indices I need in order to fix this problem. Mostly, I'm wondering why it's using filesort intead of using an index to sort.

Here are my tables:

CREATE TABLE `learning_signatures` (
  `signature_id` int(11) NOT NULL AUTO_INCREMENT,
  `signature_file` varchar(100) NOT NULL,
  `signature_md5` varchar(32) NOT NULL,
  `image_file` varchar(100) NOT NULL,
  PRIMARY KEY (`signature_id`),
  UNIQUE KEY `unique_signature_md5` (`signature_md5`)
) ENGINE=InnoDB AUTO_INCREMENT=640 DEFAULT CHARSET=latin1

CREATE TABLE `learning_user_suggestions` (
  `user_suggestion_id` int(11) NOT NULL AUTO_INCREMENT,
  `signature_id` int(11) NOT NULL,
  `ch` char(1) NOT NULL,
  `time_suggested` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `user_id` int(11) NOT NULL,
  PRIMARY KEY (`user_suggestion_id`),
  KEY `char_index` (`ch`),
  KEY `ls_sig_id_indx` (`signature_id`),
  KEY `user_id_indx` (`user_id`),
  KEY `sig_char_indx` (`signature_id`,`ch`)
) ENGINE=InnoDB AUTO_INCREMENT=1173 DEFAULT CHARSET=latin1

And here is the problematic SQL statement I'm planning on using in my view:

select ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, sug.ch , count(sug.ch) AS suggestion_count
from (`learning_signatures` `ls` left join `learning_user_suggestions` `sug` on(ls.signature_id = sug.signature_id))
group by ls.signature_id, sug.ch;

Output from explain:

id  select_type table   type    possible_keys                   key             key_len ref                 rows    Extra
1   SIMPLE      ls      ALL     NULL     开发者_运维百科                       NULL            NULL    NULL                514     "Using temporary; Using filesort"
1   SIMPLE      sug     ref     ls_sig_id_indx,sig_char_indx    ls_sig_id_indx  4       wwf.ls.signature_id 1

Another example, this time using a where clause:

explain select ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, sug.ch , count(sug.ch) AS suggestion_count
from (`learning_signatures` `ls` left join `learning_user_suggestions` `sug` on(ls.signature_id = sug.signature_id))
WHERE signature_md5 = '75f8a5b1176ecc2487b90bacad9bc4c'
group by ls.signature_id, sug.ch;

Explain output:

id  select_type table   type    possible_keys                key                    key_len ref     rows    Extra
1   SIMPLE      ls      const   unique_signature_md5         unique_signature_md5   34      const   1       "Using temporary; Using filesort"
1   SIMPLE      sug     ref     ls_sig_id_indx,sig_char_indx ls_sig_id_indx         4       const   1   


In your first query, what you do is join your signatures table with user suggestions, getting lots of rows, and then group results using some columns from user suggestions. But there is no index for the joined table to help with grouping as it would have to be defined on previously joined table. What you should instead do is try to create a derived table from user suggestions that is already groupped by ch and signature_id and then join it:

SELECT ls.signature_id, ls.signature_file, ls.signature_md5, ls.image_file, 
       sug.ch, sug.suggestion_count
FROM learning_signatures ls
LEFT JOIN 
  (SELECT s.signature_id, s.ch, count(s.ch) as suggestion_count
    FROM learning_user_suggestions s 
    GROUP BY s.signature_id, s.ch ) as sug
ON ls.signature_id = sug.signature_id

Optimizer should be able now to use your sig_char_indx index for groupping, the derived table will be not bigger than your signatures table and you join both using unique column. You will still have to do a full scan over signatures table, but that cannot be avoided because you are selecting all of it anyway.

As for the second query, if you want to restrict signatures to a single one just append

WHERE ls.signature_md5='75f8a5b1176ecc2487b90bacad9bc4c'

to the end of previous query and group by only s.ch, because only one signature_id will match your md5 anyway. Optimizer should now use md5 index for where and char_index for grouping.


Maybe it'll help if you create an index on learning_signatures that contains both signature_md5 and signature_id (and in that order)

`KEY `md5_id` (`signature_md5`,`signature_id`)?

I'm no MySQL expert but I found that created keys that encapsulate both the where clause and the join clause usually helps get rid of temporary and filesort


Use indexes. Find the fields where you need them by using EXPLAIN on your queries.

If you however have a mostly write-only DB (few reads), you may want to refrain from using indexes, as they may have a negative impact on your write performance.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜