开发者

Help me optimize this query

I have this query for an application that I am designing. There is a table of references, an authors table and a reference_authors table. There is a sub query to return all authors for a g开发者_JS百科iven reference which I then display formatted in php. The subquery and query run individually are both nice and speedy. However as soon as the subquery is put into the main query the whole thing takes over 120s to run. I would apprecaite some fresh eyes on this one. Thanks.

SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
(SELECT GROUP_CONCAT(a.author_name)
FROM authors_final AS a
INNER JOIN reference_authors AS ra2 ON ra2.author_id = a.author_id
WHERE ra2.reference_id = rf.reference_id
GROUP BY ra2.reference_id) AS authors
FROM
references_final AS rf
INNER JOIN reference_authors AS ra ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id;

Here is the fixed query. Thanks guys for the recommendations.

SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
GROUP_CONCAT(a.author_name) AS authors
FROM
references_final as rf
INNER JOIN (reference_authors AS ra INNER JOIN authors_final AS a ON ra.author_id = a.author_id)
ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id
GROUP BY rf.reference_id


Although not every subquery can be rewritten as an inner join, I think yours can.

From 120 seconds to 78 milliseconds is not a bad improvement--about three orders of magnitude. Take the rest of the day off.

When you come back tomorrow, start looking for other subqueries in your source code.


You say the subquery is nice and speedy in isolation but its now obviously running for every single row - 100 rows = 100 sub queries.

Assuming you have indexes on all your foreign keys that's as good as it gets as a sub query.

One option is to left join authors and create a Cartesian product - you'll have a lot more rows returned and will need some code to get to the same end result but it will put less strain on the db and will run quicker.

If you've got paging on and say are returning 10 rows, issung 10 individual calls to get the authors in isolation would also be be pretty quick.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜