开发者

mysql subquery strangely slow

I have a query to select from another sub-query select. While the two queries look almost the same the second query (in this sample) runs much slower:

SELECT
   user.id
  ,user.first_name
  -- user.*
  FROM user
  WHERE
    user.id IN (SELECT ref_id 
                  FROM education 
                 WHERE ref_type='user' 
                   AND education.institute_id='58' 
                   AND education.institute_type='1'
                );

This query takes 1.2s Explain on this query results:

 id select_type table   type    possible_keys   key key_len ref rows    Extra
 1  PRIMARY         user   index    first_name  152 141192  Using where; Using index
 2  DEPENDENT SUBQUERY  education   index_subquery  ref_type,ref_id,institute_id,institute_type,ref_type_2  ref_id  4   func    1   Using where

The second query:

SELECT
  -- user.id
  -- user.first_name
  user.*
  FROM user
  WHERE
    user.id IN (SELECT ref_id 
                  FROM education 
                 WHERE ref_type='user' 
                   AND education.institute_id='58' 
                   AND education.institute_type='1'
                );

Takes 45se开发者_如何学JAVAc to run, with explain:

 id select_type table   type    possible_keys   key key_len ref rows    Extra
 1  PRIMARY user    ALL                 141192  Using where
 2  DEPENDENT SUBQUERY  education   index_subquery  ref_type,ref_id,institute_id,institute_type,ref_type_2  ref_id  4   func    1   Using where

Why is it slower if i query only by index fields? Why both queries scans the full length of the user table? Any ideas how to improve?

Thanks.


I'm not sure why it chooses to use the index when you select only two columns but not when you select all columns, but it is better to select only the columns you need anyway. Also it might be better to try a JOIN instead of a subquery:

SELECT
    user.id
    user.first_name
FROM user
JOIN education 
ON user.id = education.ref_id 
AND education.ref_type='user' 
AND education.institute_id='58' 
AND education.institute_type='1'


I have had several occasions where replacing "WHERE foo in (subquery)" with dumping the results of the subquery into a temporary table and using an inner join seriously improved the performance. (Like, a 6.5 minute query turning into a sub-second query.)

Which, er, is what Mark Byers just said.


This is what I think happens:

The query planner will turn the query into an inner join, which gives the database the freedom to start from either table when filtering out the result.

When you only select a few fields from the user table, the result from both tables are small, so the database can choose which table will filter the other from what's most efficient depending on what indexes can be used.

When you fetch all data from the user table, you are forcing it to use the education table to filter the user table as the intermediate result would be too large the other way around. There is no index that fits for matching that way, so you get a table scan which slows down the query.

(Excuse me if some of the terminology is coloured from SQL Server, that's what I regularly use.)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜