Help me change this single complex query to use temporary tables
About the system: - There are tutors who create classes and packs - A tags based search approach is being followed.Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searcheable). For details please check the section How tags work in this system? below.
Following is the concerned query Can anybody help me suggest an approach using temporary tables. We have indexed all the relevant fields and it looks like this is the least time possible with this approach:-
SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%"
OR tt.tag LIKE "%Dictatorship%"
OR ttt.tag LIKE "%Dictatorship%" )) AS key_1_total_matches
,
SUM(DISTINCT( t.tag LIKE "%democracy%"
OR tt.tag LIKE "%democracy%"
OR ttt.tag LIKE "%democracy%" )) AS key_2_total_matches
,
COUNT(DISTINCT( od.id_od )) AS
tutor_popularity,
CASE
WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
AND wc.wc_type = 0
AND wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'classes_published'
,
CASE
WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN ( 'INT' )
) ), 0)
) THEN 1
ELSE 0
END AS 'packs_published',
td . *,
u . *
FROM tutor_details AS td
JOIN users AS u
ON u.id_user = td.id_user
LEFT JOIN learning_packs_tag_relations AS lptagrels
ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN learning_packs AS lp
ON lptagrels.id_lp = lp.id_lp
LEFT JOIN learning_packs_categories AS lpc
ON lpc.id_lp_cat = lp.id_lp_cat
LEFT JOIN learning_packs_categories AS lpcp
ON lpcp.id_lp_cat = lpc.id_parent
LEFT JOIN learning_pack_content AS lpct
ON ( lp.id_lp = lpct.id_lp )
LEFT JOIN webclasses_tag_relations AS wtagrels
ON td.id_tutor = wtagrels.id_tutor
LEFT JOIN webclasses AS wc
ON wtagrels.id_wc = wc.id_wc
LEFT JOIN learning_packs_categories AS wcc
ON wcc.id_lp_cat = wc.id_wp_cat
LEFT JOIN learning_packs_categories AS wccp
ON wccp.id_lp_cat = wcc.id_parent
LEFT JOIN order_details AS od
ON td.id_tutor = od.id_author
LEFT JOIN orders AS o
ON od.id_order = o.id_order
LEFT JOIN tutors_tag_relations AS ttagrels
ON td.id_tutor = ttagrels.id_tutor
LEFT JOIN tags AS t
ON t.id_tag = ttagrels.id_tag
LEFT JOIN tags AS tt
ON tt.id_tag = lptagrels.id_tag
LEFT JOIN tags AS ttt
ON ttt.id_tag = wtagrels.id_tag
WHERE ( u.country = 'IE'
OR u.country IN ( 'INT' ) )
AND CASE
WHEN ( ( tt.id_tag = lptagrels.id_tag )
AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
AND lp.published = 1
AND lpcp.status = 1
AND ( lpcp.country_code = 'IE'
OR lpcp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( ( ttt.id_tag = wtagrels.id_tag )
AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
AND wc.wc_type = 0
AND
wc.class_date > '2010-06-01 22:00:56'
AND wccp.status = 1
AND ( wccp.country_code = 'IE'
OR wccp.country_code IN (
'INT'
) )
ELSE 1
END
AND CASE
WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
AND o.order_status = 'paid'
AND CASE
WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
ELSE 1
END
ELSE 1
END
AND ( t.tag LIKE "%Dictatorship%"
OR t.tag LIKE "%democracy%"
OR tt.tag LIKE "%Dictatorship%"
OR tt.tag LIKE "%democracy%"
OR ttt.tag LIKE "%Dictatorship%"
OR ttt.tag LIKE "%democracy%" )
GROUP BY td.id_tutor
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
ORDER BY tutor_popularity DESC,
u.surname ASC,
u.name ASC
LIMIT 0, 20
The problem
The results returned by the above query are correct (AND logic working as per expectation), but the time taken by the query rises alarmingly for heavier data and for the current data I have it is like 10 seconds as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable.
Somebody suggested in my previous question to do the following:-
- create a temporary table and insert here all relevant data that might end up in the final result set
- run several updates on this table, joining the required tables one at a time instead of all of them at the same time
- finally perform a query on this temporary table to extract the end result
All this was done in a stored procedure, the end result has passed unit tests, and is blazing fast.
I have never worked with temporary tables till now. Only if I could get some hints, kind of schematic representations so that I can start with...
Is there something faulty with the query? What can be the reason behind 10+ seconds of execution time?
How tags work in this system?
- When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc.
- When a Tutors create packs, again tags are entered and tag relations are created wit开发者_如何转开发h respect to pack's details like pack name, description etc.
- tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table.
Temporary tables are not a silver bullet. The fundamental problem with your queries lies with patterns like this:
t.tag LIKE "%Dictatorship%"
OR tt.tag LIKE "%Dictatorship%"
OR ttt.tag LIKE "%Dictatorship%"
Wildcarding the left side of a LIKE comparison guarantees that an index can not be used. Effectively, you're table scanning all three tables involved...
You need to leverage Full Text Searching, either MySQL's native FTS or 3rd party stuff like Sphinx. All the FTS I've known include a scoring/rank value indicating the strength of the match - you can read the MySQL documentation for the algorithm details. But the score/rank is not the same as what you've got: SUM(DISTINCT LIKE...)
, you could get the same using something like:
SELECT t.id_tag,
COUNT(*) AS num_matches
FROM TABGS
WHERE MATCH(tag) AGAINST ('Dictatorship')
GROUP BY t.id_tag
精彩评论