Mysql - help me optimize this query (improved question)
About the system: - There are tutors who create classes and packs - A tags based search approach is being followed.Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searcheable). For details please check the section How tags work in this system? below.
Following is the concerned query
SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) AS key_2_total_matches,
COUNT(DISTINCT( od.id_od )) AS tutor_popularity,
CASE
WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'classes_published',
CASE
WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'packs_published',
td . *,
u . *
FROM tutor_details AS td
JOIN users AS u
ON u.id_user = td.id_user
LEFT JOIN learning_packs_tag_relations AS lptagrels
ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN learning_packs AS lp
ON lptagrels.id_lp = lp.id_lp
LEFT JOIN learning_packs_categories AS lpc
ON lpc.id_lp_cat = lp.id_lp_cat
LEFT JOIN learning_packs_categories AS lpcp
ON lpcp.id_lp_cat = lpc.id_parent
LEFT JOIN learning_pack_content AS lpct
ON ( lp.id_lp = lpct.id_lp )
LEFT JOIN webclasses_tag_relations AS wtagrels
ON td.id_tutor = wtagrels.id_tutor
开发者_开发百科 LEFT JOIN webclasses AS wc
ON wtagrels.id_wc = wc.id_wc
LEFT JOIN learning_packs_categories AS wcc
ON wcc.id_lp_cat = wc.id_wp_cat
LEFT JOIN learning_packs_categories AS wccp
ON wccp.id_lp_cat = wcc.id_parent
LEFT JOIN order_details AS od
ON td.id_tutor = od.id_author
LEFT JOIN orders AS o
ON od.id_order = o.id_order
LEFT JOIN tutors_tag_relations AS ttagrels
ON td.id_tutor = ttagrels.id_tutor
JOIN tags AS t
ON ( t.id_tag = ttagrels.id_tag )
OR ( t.id_tag = lptagrels.id_tag )
OR ( t.id_tag = wtagrels.id_tag )
WHERE ( u.country = 'IE'
OR u.country IN ( 'INT' ) )
AND CASE
WHEN ( ( t.id_tag = lptagrels.id_tag )
AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( ( t.id_tag = wtagrels.id_tag )
AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
AND wc.wc_type = 0
AND
wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
AND o.order_status = 'paid'
AND CASE
WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
ELSE 1
END
ELSE 1
END
AND ( t.tag LIKE "%Dictatorship%"
OR t.tag LIKE "%Democracy%" )
GROUP BY td.id_tutor
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
ORDER BY tutor_popularity DESC,
u.surname ASC,
u.name ASC
LIMIT 0, 20
The problem
The results returned by the above query are correct (AND logic working as per expectation), but the time taken by the query rises alarmingly for heavier data and for the current data I have it is like 25 seconds as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable.
It is possible that some of the delay is being caused because all the possible fields have not yet been indexed. The tag field of tags table is indexed. Is there something faulty with the query? What can be the reason behind 20+ seconds of execution time?
How tags work in this system?
- When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc.
- When a Tutors create packs, again tags are entered and tag relations are created with respect to pack's details like pack name, description etc.
- tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table.
The explain query output:- Please see this screenshot - http://www.test.examvillage.com/Explain_query.jpg
You may see if it helps adding indexes on following fields:
lptagrels.id_tutor
wtagrels.id_tutor
od.id_author
and then try to get rid of the case-when
structures from the where clause.
You can add some table specific restrictions directly to join like: left join t2 on t1.id = t2.id AND ...
to make code more readable.
EDIT: Seems you have a wrong approach here: you search for all tags and then count the tags that match search. Instead you should search for tags that match and then count results that have these tags.
I've recently had a similar problem. I had to modify a query to implement a new feature, and that meant adding several joins and left joins. The logic was correctly implemented, but it took forever with some bigger tables.
The solution was a complete rewrite, as Brian suggests.
My new approach was something like this:
- create a temporary table and insert here all relevant data that might end up in the final result set
- run several updates on this table, joining the required tables one at a time instead of all of them at the same time
- finally perform a query on this temporary table to extract the end result
All this was done in a stored procedure, the end result has passed unit tests, and is blazing fast.
UPDATE
Please test this query, to see if it returns the same results as the original. If it does, then I will further try to find a better implementation.
SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) AS key_2_total_matches,
COUNT(DISTINCT( od.id_od )) AS tutor_popularity,
(wc.id_wc > 0
AND wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE' OR wccp.country_code IN ( 'INT' ))
) AS 'classes_published',
(lp.id_lp > 0
AND lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ( 'INT' ) )
) AS 'packs_published',
td . *,
u . *
FROM tutor_details AS td JOIN users AS u ON u.id_user = td.id_user
LEFT JOIN order_details AS od ON td.id_tutor = od.id_author
LEFT JOIN orders AS o ON od.id_order = o.id_order
LEFT JOIN learning_packs_tag_relations AS lptagrels ON td.id_tutor = lptagrels.id_tutor --
LEFT JOIN learning_packs AS lp ON lptagrels.id_lp = lp.id_lp
LEFT JOIN learning_packs_categories AS lpc ON lpc.id_lp_cat = lp.id_lp_cat
LEFT JOIN learning_packs_categories AS lpcp ON lpcp.id_lp_cat = lpc.id_parent
LEFT JOIN learning_pack_content AS lpct ON ( lp.id_lp = lpct.id_lp )
LEFT JOIN webclasses_tag_relations AS wtagrels ON td.id_tutor = wtagrels.id_tutor --
LEFT JOIN webclasses AS wc ON wtagrels.id_wc = wc.id_wc
LEFT JOIN learning_packs_categories AS wcc ON wcc.id_lp_cat = wc.id_wp_cat
LEFT JOIN learning_packs_categories AS wccp ON wccp.id_lp_cat = wcc.id_parent
LEFT JOIN tutors_tag_relations AS ttagrels ON td.id_tutor = ttagrels.id_tutor --
JOIN tags AS t ON ( t.id_tag = ttagrels.id_tag )
OR ( t.id_tag = lptagrels.id_tag )
OR ( t.id_tag = wtagrels.id_tag )
WHERE ( u.country = 'IE' OR u.country IN ( 'INT' ) )
AND (NOT ( t.id_tag = lptagrels.id_tag AND lp.id_lp > 0)
or (lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ('INT') )
)
)
AND (not (t.id_tag = wtagrels.id_tag AND wc.id_wc > 0 )
or (
wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE' OR wccp.country_code IN ('INT' ) )
)
)
AND (NOT (od.id_od > 0)
OR (
od.id_author = td.id_tutor
AND o.order_status = 'paid'
AND (NOT (od.id_wc > 0) OR od.can_attend_class = 1)
)
)
AND ( t.tag LIKE "%Dictatorship%" OR t.tag LIKE "%Democracy%" )
GROUP BY td.id_tutor
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
ORDER BY tutor_popularity DESC,
u.surname ASC,
u.name ASC
LIMIT 0, 20
Take this code out the back and shoot it.
Then start again.
I'm not being flippant but this is horrific and you would do yourself and anyone else touching it in the future a big favour by getting rid of it right now.
精彩评论