MYSQL Convert rows to columns performance problem
I am doing a query that converts rows to columns similar to this post but have encountered a performance problem. Here is the query:-
SELECT
Info.Customer,
Answers.Answer,
Answers.AnswerDescription,
Details.Code1,
Details.Code2,
Details.Code3
FROM
Info
LEFT OUTER JOIN Answers
ON Info.AnswerID = Answers.AnswerID
LEFT OUTER JOIN
(SELECT
ReferenceNo,
MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
FROM Deta开发者_如何转开发ilsData
GROUP BY ReferenceNo) Details
ON Info.ReferenceNo = Details.ReferenceNo
WHERE Info.Date BETWEEN x AND y
There are less than 300 rows returned, but the Details table is about 180 thousand rows. The query takes 45 seconds to run in half the time. Note the inner query takes 7 seconds to run.
When I type show processlist;
into MYSQL it is hanging on "Sending Data".
Any thoughts as to what the performance problem might be?
First, the inner subselect that queries the "Details" result is querying against ALL entries... is that what you want? I don't think so. You only appear to want results based on a reference number that qualifies from the outer date check.
So, I would change your inner select to include...
FROM DetailsData
where DetailsData.ReferenceNo IN
( select distinct Info2.ReferenceNo
from Info Info2 where Info2.Date between x and y )
GROUP BY ReferenceNo) Details
This way, you are only getting details associated with reference numbers within the date range in question.
Well as a start I would move the where info.date and join of Details and Info inside the subquery.
Also, are the Details.ReferenceNo and Info.Date indexed?
EDIT: Well, here's the version that I had in mind
SELECT
Details.Customer,
Answers.Answer,
Answers.AnswerDescription,
Details.Code1,
Details.Code2,
Details.Code3
FROM
(SELECT
Info.Customer,
Info.AnswerID,
Info.ReferenceNo,
MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
FROM Info LEFT JOIN DetailsData ON Info.ReferenceNo = Details.ReferenceNo
WHERE Info.Date BETWEEN x AND y
GROUP BY ReferenceNo) Details
LEFT OUTER JOIN Answers ON Details.AnswerID = Answers.AnswerID
So the first iteration is to reduce the number of records generated in the subquery (reason: the source of it has several hundred records and there is aggregation on it, so mysql has to build a resultset for further joining) - in that sense DRapp's solution and this one are similar.
However, the question is if it is really necessary to have a subquery at all? It would be intresting to see how would the following perform
SELECT
Info.Customer,
Answers.Answer,
Answers.AnswerDescription,
MAX(CASE DetailsIndicator WHEN 'cde1' THEN DetailsCode ELSE NULL END ) Code1,
MAX(CASE DetailsIndicator WHEN 'cde2' THEN DetailsCode ELSE NULL END ) Code2,
MAX(CASE DetailsIndicator WHEN 'cde3' THEN DetailsCode ELSE NULL END ) Code3
FROM
FROM Info
LEFT JOIN DetailsData ON Info.ReferenceNo = DetailsData.ReferenceNo
LEFT OUTER JOIN Answers ON Details.AnswerID = Answers.AnswerID
WHERE Info.Date BETWEEN x AND y
GROUP BY ReferenceNo
Given the fact that there are only hundreds records in the result and if there are indexes covering joins and select condition I would expect the above to return results in less then a second on a modest hardware.
(queries untested)
Try use "EXPLAIN", to define which query is slow
精彩评论