开发者

MYSQL Convert rows to columns performance problem

I am doing a query that converts rows to columns similar to this post but have encountered a performance problem. Here is the query:-

SELECT
    Info.Customer,
    Answers.Answer,
    Answers.AnswerDescription,
    Details.Code1,
    Details.Code2,
    Details.Code3
FROM
    Info
    LEFT OUTER JOIN Answers
    ON Info.AnswerID = Answers.AnswerID
    LEFT OUTER JOIN
    (SELECT
      ReferenceNo,
      MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
      MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
      MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
    FROM Deta开发者_如何转开发ilsData
    GROUP BY ReferenceNo) Details
    ON Info.ReferenceNo = Details.ReferenceNo
    WHERE Info.Date BETWEEN x AND y

There are less than 300 rows returned, but the Details table is about 180 thousand rows. The query takes 45 seconds to run in half the time. Note the inner query takes 7 seconds to run.

When I type show processlist; into MYSQL it is hanging on "Sending Data".

Any thoughts as to what the performance problem might be?


First, the inner subselect that queries the "Details" result is querying against ALL entries... is that what you want? I don't think so. You only appear to want results based on a reference number that qualifies from the outer date check.

So, I would change your inner select to include...

FROM DetailsData 
where DetailsData.ReferenceNo IN 
    ( select distinct Info2.ReferenceNo
        from Info Info2 where Info2.Date between x and y )
GROUP BY ReferenceNo) Details 

This way, you are only getting details associated with reference numbers within the date range in question.


Well as a start I would move the where info.date and join of Details and Info inside the subquery.

Also, are the Details.ReferenceNo and Info.Date indexed?

EDIT: Well, here's the version that I had in mind

SELECT
    Details.Customer,
    Answers.Answer,
    Answers.AnswerDescription,
    Details.Code1,
    Details.Code2,
    Details.Code3
FROM
    (SELECT
      Info.Customer,
      Info.AnswerID,
      Info.ReferenceNo,
      MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
      MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
      MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
    FROM Info LEFT JOIN DetailsData ON Info.ReferenceNo = Details.ReferenceNo
    WHERE Info.Date BETWEEN x AND y
    GROUP BY ReferenceNo) Details
    LEFT OUTER JOIN Answers ON Details.AnswerID = Answers.AnswerID

So the first iteration is to reduce the number of records generated in the subquery (reason: the source of it has several hundred records and there is aggregation on it, so mysql has to build a resultset for further joining) - in that sense DRapp's solution and this one are similar.

However, the question is if it is really necessary to have a subquery at all? It would be intresting to see how would the following perform

SELECT
    Info.Customer,
    Answers.Answer,
    Answers.AnswerDescription,
    MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
    MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
    MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
FROM
    FROM Info 
    LEFT JOIN DetailsData ON Info.ReferenceNo = DetailsData.ReferenceNo
    LEFT OUTER JOIN Answers ON Details.AnswerID = Answers.AnswerID
WHERE Info.Date BETWEEN x AND y
GROUP BY ReferenceNo

Given the fact that there are only hundreds records in the result and if there are indexes covering joins and select condition I would expect the above to return results in less then a second on a modest hardware.

(queries untested)


Try use "EXPLAIN", to define which query is slow

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜