
Brutally slow MySQL Query

I'm happily taking any advice on this - be it rewriting the query, or setting up the tables differently.

What I have, basically, is three tables: a product table, a location table, and a condition table. The location table stores all the location information over time, and the condition table does the same for condition. The trick of this massive query is to pluck out the products with only their latest conditions and locations.

I took the general idea from this question: MySQL MIN/MAX returning proper value, but not the related record info

Is the answer just to store the current location and condition in the main product table, and keep these history tables, but not use them to search by? I like the idea of keeping them separate, but of course this query takes 50 seconds to run, which is not practical at all.

SELECT 
'$table' AS tablename, 
$table.id, 
product_name, 
$table.status,
CL.event AS last_event,
CONCAT_WS(' ', CL.location, CL.floor, CL.bin, CL.bay) AS current_loc,
CC.status AS current_cond
FROM $table

LEFT OUTER JOIN
    (SELECT DISTINCT
        C.work_type,
        C.work_id,
        C.status,
        C.inspected_timestamp
        FROM
        (SELECT 
            CONCAT(work_type, work_id) AS condition_id, 
            status,
            MAX(inspected_timestamp) as current
            FROM conditions
            GROUP BY condition_id
        ) XC
    JOIN conditions C
      on CONCAT(C.work_type, C.work_id) = XC.condition_id
      and C.inspected_timestamp = XC.current
    ) CC ON 
    $table.id = CC.work_id AND 
    CC.work_type = '$table'                         

LEFT OUTER JOIN
    (SELECT DISTINCT
        L.work_type,
        L.work_id,
        L.event,
        L.location,
        L.floor,
        L.bin,
        L.bay,
        L.timestamp
        FROM
        (SELECT
            CONCAT(work_type, work_id) AS location_id, 
            location,
            MAX(timestamp) as current
            FROM locations
            GROUP BY location_id
        ) XL
    JOIN locations L
      on CONCAT(L.work_type, L.work_id) = XL.location_id
      and L.timestamp = XL.current
    ) CL ON 
    $table.id = CL.work_id AND 
    CL.work_type = '$table'

HAVING last_event = 'Received'

I am adding the results of EXPLAIN EXTENDED below.

[0] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => paintings 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1159 
    [filtered] => 100.00 
    [Extra] => )

[1] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3211 
    [filtered] => 100.00 
    [Extra] => ) 

[2] => Array ( 
    [id] => 1 
    [select_type] => PRIMARY 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1870 
    [filtered] => 100.00 
    [Extra] => ) 

[3] => Array ( 
    [id] => 4 
    [select_type] => DERIVED 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 1868 
    [filtered] => 100.00 
    [Extra] => Using temporary )

[4] => Array ( 
    [id] => 4 
    [select_type] => DERIVED 
    [table] => L 
    [type] => ref 
    [possible_keys] => timestamp 
    [key] => timestamp 
    [key_len] => 8 
    [ref] => XL.current 
    [rows] => 5 
    [filtered] => 100.00 
    [Extra] => Using where ) 

[5] => Array ( 
    [id] => 5 
    [select_type] => DERIVED 
    [table] => locations 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3913 
    [filtered] => 100.00 
    [Extra] => Using temporary; Using filesort ) 

[6] => Array ( 
    [id] => 2 
    [select_type] => DERIVED 
    [table] => 
    [type] => ALL 
    [possible_keys] => 
    [key] => 
    [key_len] => 
    [ref] => 
    [rows] => 3191 
    [filtered] => 100.00 
    [Extra] => Using temporary ) 

[7] => Array ( 
    [id] => 2 
    [select_type] => DERIVED 
    [table] => C 
    [type] => ref 
    [possible_keys] => inspected_timestamp 
    [key] => inspected_timestamp 
    [key_len] => 8 
    [ref] => XC.current 
    [rows] => 45 
    [filtered] => 100.00 
    [Extra] => Using where ) 

[8] => Array (
     [id] => 3 
    [select_type] => DERIVED 
    [table] => conditions 
    [type] => index 
    [possible_keys] => 
    [key] => work_type_2 
    [key_len] => 316 
    [ref] => 
    [rows] => 3986 
    [filtered] => 100.00 
    [Extra] => Using index; Using temporary; Using filesort )


There are a few things you can do:

  1. EXPLAIN PLAN on the query. See if there's a TABLE SCAN in there somewhere. That's the killer.
  2. See if rearranging the query makes a difference in the EXPLAIN PLAN results. Filtering more records out early will decrease the time needed.
  3. Check to make sure that columns in every WHERE clause have an index.
  4. The more records involved, the longer the query. How much history are you retaining? How many rows are you talking about? You should have a policy that removes records older than your retention cutoff and puts them in a history or reporting schema.
  5. Can you take advantage of triggers or views to pre-calculate any of these values?
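Points 3 and 5 can be sketched together: an index on the history table's filter/join columns, plus a trigger that keeps a denormalized "current condition" column on the product table up to date on every insert. A minimal sketch using Python's sqlite3 with a simplified, hypothetical schema (a production trigger would also compare timestamps rather than assume inserts arrive in order):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical simplified schema: products plus a conditions history
# table, with a denormalized current_cond column on products.
cur.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT,
                       current_cond TEXT);
CREATE TABLE conditions (work_id INTEGER, status TEXT,
                         inspected_timestamp INTEGER);
-- Point 3: index the columns used to filter and join.
CREATE INDEX idx_cond ON conditions (work_id, inspected_timestamp);
-- Point 5: a trigger pre-calculates the "latest condition" so the
-- big query never has to derive it.
CREATE TRIGGER cond_insert AFTER INSERT ON conditions
BEGIN
  UPDATE products SET current_cond = NEW.status WHERE id = NEW.work_id;
END;
""")

cur.execute("INSERT INTO products (id, product_name) VALUES (1, 'Painting A')")
cur.execute("INSERT INTO conditions VALUES (1, 'Good', 100)")
cur.execute("INSERT INTO conditions VALUES (1, 'Damaged', 200)")

current = cur.execute(
    "SELECT current_cond FROM products WHERE id = 1").fetchone()[0]
print(current)
```

With the summary column maintained by the trigger, the 50-second query collapses to a plain read of the product table; the history tables stay intact for auditing.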


I'm putting this in an answer purely due to the limit on comment length.

I looked at your query for quite a while, and I think it is largely the nature of the query, and the way it has been written, that is causing it to take so much time, but I don't see anything obviously wrong either.

In the places where you do a GROUP BY to get a summary row and then self-join those results back, that is going to be costly, as the EXPLAIN shows: it is table scanning. (I don't fully understand the design of your tables or the data, so there may be more to it.) You are also right that building temporary tables and sorting them is even more costly.

So having those values pre-summarized and accessible in a summary table would help quite a bit if the time taken is simply unacceptable. When you look at the EXPLAIN, note the row counts, as they should give you a good idea of whether what the query is doing is reasonable.

Also, the HAVING clause at the end is, by definition, not going to be optimized. If there is a way to move it into a WHERE clause or into the criteria of one of the joins, you have a chance to improve the query plan significantly, though considering the cost of the summaries it will still take some time.
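To illustrate the HAVING-versus-WHERE point: filtering on the underlying column in the WHERE clause (rather than on the SELECT alias in HAVING) lets the optimizer discard rows before the join output is materialized, and it can use an index on that column. A minimal sketch using Python's sqlite3 and a simplified, hypothetical two-table schema (not the asker's real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE products  (id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE locations (work_id INTEGER, event TEXT, ts INTEGER);
""")
cur.executemany("INSERT INTO products VALUES (?, ?)",
                [(1, 'A'), (2, 'B')])
cur.executemany("INSERT INTO locations VALUES (?, ?, ?)",
                [(1, 'Received', 100), (2, 'Shipped', 100)])

# Filter on the real column in WHERE instead of HAVING on the alias.
# Note: a WHERE on a left-joined column also drops NULL (unmatched) rows,
# which matches what HAVING last_event = 'Received' was doing anyway.
rows = cur.execute("""
    SELECT p.id, l.event AS last_event
    FROM products p
    LEFT JOIN locations l ON l.work_id = p.id
    WHERE l.event = 'Received'
""").fetchall()
print(rows)
```

Only product 1 survives the filter, the same result the original HAVING clause produces, but the predicate is now visible to the optimizer.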

The only thing I can advise at this point is to break it down into small pieces and see if you can optimize the individual components and then reassemble.


As @gview explained, there are numerous things contributing to this query being brutally slow. Besides everything mentioned in his answer, there is also the use of the CONCAT() function on columns of two tables, where the results are later used to JOIN the two derived tables; wrapping join columns in a function prevents any index on those columns from being used.

If you just want to show the rows of the product table with only the latest related row in location and the latest related row in condition, you can use something like the following (this has only the logic for the latest condition; you'll need another similar LEFT JOIN for the latest location):

SELECT 
  t.id, 
  t.product_name, 
  t.status,
  cc.status AS current_cond
FROM 
      $table AS t
  LEFT OUTER JOIN
      ( SELECT c.*
        FROM 
              conditions AS c
          JOIN
              ( SELECT 
                  work_id, 
                  MAX(inspected_timestamp) as current_ts
                FROM conditions mc
                WHERE work_type = '$table'
                GROUP BY work_id
              ) AS mc
            ON  mc.work_id = c.work_id
            AND mc.current_ts = c.inspected_timestamp 
        WHERE c.work_type = '$table'
      ) AS cc  
    ON cc.work_id = t.id                     
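For the query above to avoid table scans, a composite index covering the filter, grouping, and join columns of the history table would likely be needed, e.g. on (work_type, work_id, inspected_timestamp). A runnable sketch of the latest-row-per-group pattern, using Python's sqlite3 and a simplified, hypothetical conditions table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE conditions (work_type TEXT, work_id INTEGER,
                         status TEXT, inspected_timestamp INTEGER);
-- Composite index supporting the WHERE, the GROUP BY, and the join.
CREATE INDEX idx_wt_wid_ts
    ON conditions (work_type, work_id, inspected_timestamp);
""")
cur.executemany("INSERT INTO conditions VALUES (?, ?, ?, ?)", [
    ('paintings', 1, 'Good',    100),
    ('paintings', 1, 'Damaged', 200),   # latest for work_id 1
    ('paintings', 2, 'Fair',    150),   # latest for work_id 2
])

# Latest condition per work_id: same shape as the derived table "cc"
# in the answer's query, joining on plain columns instead of CONCAT().
latest = cur.execute("""
    SELECT c.work_id, c.status
    FROM conditions c
    JOIN (SELECT work_id, MAX(inspected_timestamp) AS current_ts
          FROM conditions
          WHERE work_type = 'paintings'
          GROUP BY work_id) mc
      ON  mc.work_id = c.work_id
      AND mc.current_ts = c.inspected_timestamp
    WHERE c.work_type = 'paintings'
    ORDER BY c.work_id
""").fetchall()
print(latest)
```

Each work_id comes back with only its most recent status, and because the join keys are bare columns, the composite index can serve both the aggregation and the join.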