
Optimizing UNION MySQL queries where two queries use a lot of the same tables

I am using UNION in MySQL to combine the results of two separate queries. The two queries use many of the same tables. Is there any way to take advantage of this to optimize the query?

    SELECT  2 AS RELEVANCE_SCORE_TYPE,
        VIEWER_ID, 
        QUESTION_ID, 
        sum(ANSWER_SCORE) AS RELEVANCE_SCORE
FROM(SELECT  cr.COMMUNICATIONS_ID AS ANSWER_ID, 
        cr.CONSUMER_ID as VIEWER_ID,
        nc.PARENT_COMMUNICATIONS_ID AS QUESTION_ID,
        case when  cr.CONSUMER_ID= nc.SENDER_CONSUMER_ID then 3*((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
            else ((24/(((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(cal.LAST_MOD_TIME)+3600)/3600))*(ces.EXPERT_SCORE * cirm.CONSUMER_RATING) + (12.5 * scs.SIMILARITY)* (1 - EXP(-0.5 * (cal.TIPS_AMOUNT / ATV.AVG_TIPS)) + .15)))
        end as ANSWER_SCORE
FROM (SELECT 238 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
UNION
SELECT 1 AS RELEVANCE_SCORE_TYPE,
       qcr.CONSUMER_ID AS Viewer_ID, 
       qcr.COMMUNICATIONS_ID, 
       case when reply.replies IS NOT NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(-reply.replies))
       when reply.replies IS NULL AND qcr.CONSUMER_ID <> qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(ces.EXPERT_SCORE+2.5*scs.SIMILARITY)*(EXP(0))
       when reply.replies IS NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(0))
       when reply.replies IS NOT NULL AND qcr.CONSUMER_ID = qcr.SENDER_CONSUMER_ID then
       24/((UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(qcr.LAST_MOD_TIME)+3600)/3600)*(7.5)*(1-EXP(-reply.replies))
            else null
       end as QUESTION_SCORE
FROM (SELECT 238 AS CONSUMER_ID,
            SENDER_CONSUMER_ID, 
            COMMUNICATIONS_ID,
            LAST_MOD_TIME
     FROM network_communications
     WHERE NETWORK_COMM_TYPE_ID=1) AS qcr
JOIN communication_interest_mapping AS cim 
    ON qcr.COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_expert_score AS ces
    ON ces.CONSUMER_ID=qcr.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON qcr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR qcr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND qcr.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
LEFT JOIN (SELECT COUNT(*) AS replies, 
           PARENT_COMMUNICATIONS_ID
           FROM network_communications AS nc1
           WHERE NETWORK_COMM_TYPE_ID=2
GROUP BY PARENT_COMMUNICATIONS_ID) AS reply
    ON qcr.COMMUNICATIONS_ID=reply.PARENT_COMMUNICATIONS_ID
ORDER BY RELEVANCE_SCORE DESC;
FROM (SELECT 234 AS CONSUMER_ID, 
             ACTION_LOG_ID, 
            COMMUNICATIONS_ID 
     FROM consumer_action_log 
     WHERE COMM_TYPE_ID=4) AS cr
JOIN network_communications AS nc 
    ON cr.COMMUNICATIONS_ID=nc.COMMUNICATIONS_ID
JOIN consumer_action_log AS cal 
    ON cr.ACTION_LOG_ID=cal.ACTION_LOG_ID
JOIN communication_interest_mapping AS cim 
    ON nc.PARENT_COMMUNICATIONS_ID=cim.COMMUNICATION_ID
JOIN consumer_interest_rating_mapping AS cirm 
    ON cr.CONSUMER_ID=cirm.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=cirm.CONSUMER_INTEREST_ID
JOIN consumer_expert_score AS ces 
    ON nc.SENDER_CONSUMER_ID=ces.CONSUMER_ID
    AND cim.CONSUMER_INTEREST_EXPERT_ID=ces.CONSUMER_EXPERT_ID
JOIN survey_customer_similarity AS scs 
    ON cr.CONSUMER_ID=scs.CONSUMER_ID_2 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_1 
    OR cr.CONSUMER_ID=scs.CONSUMER_ID_1 
    AND cal.SENDER_CONSUMER_ID=scs.CONSUMER_ID_2
CROSS JOIN
    (
        SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
        FROM CONSUMER_ACTION_LOG AS cal
        JOIN (SELECT 234 AS CONSUMER_ID, 
                     ACTION_LOG_ID, 
                     COMMUNICATIONS_ID 
              FROM consumer_action_log 
              WHERE COMM_TYPE_ID=4) AS cr 
        ON cal.SENDER_CONSUMER_ID=cr.consumer_id
    ) ATV) AS ASM
GROUP BY ANSWER_ID
ORDER BY ANSWER_SCORE_SUMMED DESC;

It's long, so don't feel obligated to read the whole thing. The gist is simply that the queries on both sides of the UNION use a lot of the same tables.


My first thought was that it's not worth trying to out-think the MySQL optimizer, especially since you're doing 18 joins and 2 cross joins.

It's not unusual for queries on both sides of a UNION to use a lot of the same tables. It's the kind of situation I'd expect the optimizer to be able to deal with.

To get better performance, you'll need to look at the execution plan and, probably, rewrite the query to eliminate the cross joins. To see the execution plan, run one of these two statements.

EXPLAIN <your query>
EXPLAIN EXTENDED <your query>
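
As a rough sketch of both steps, the statements below first run EXPLAIN on the derived table that feeds the CROSS JOIN in the question, then show one possible way to remove that cross join. The rewrite rests on an assumption that is not confirmed by the question: that the subquery is only meant to compute the average tip for the hard-coded consumer 234 (the question uses 238 elsewhere, and that value is left as posted), and that at least one row has COMM_TYPE_ID = 4. The user variable @avg_tips is a name made up for illustration. Note also that EXPLAIN EXTENDED was removed in MySQL 8.0; on recent versions plain EXPLAIN already returns the extended information.

```sql
-- Step 1: inspect the plan for the derived table behind the CROSS JOIN.
-- In the output, the type, key, rows and Extra columns show whether an
-- index is used and roughly how many rows MySQL expects to examine.
EXPLAIN
SELECT AVG(cal.TIPS_AMOUNT) AS AVG_TIPS
FROM consumer_action_log AS cal
JOIN (SELECT 234 AS CONSUMER_ID,
             ACTION_LOG_ID,
             COMMUNICATIONS_ID
      FROM consumer_action_log
      WHERE COMM_TYPE_ID = 4) AS cr
    ON cal.SENDER_CONSUMER_ID = cr.CONSUMER_ID;

-- Step 2 (assumption): cr.CONSUMER_ID is the constant 234, so the join
-- only repeats each matching cal row once per cr row, which leaves the
-- average unchanged. Under that reading the value can be computed once.
SELECT AVG(TIPS_AMOUNT) INTO @avg_tips
FROM consumer_action_log
WHERE SENDER_CONSUMER_ID = 234;

-- The main query could then refer to @avg_tips where it currently uses
-- ATV.AVG_TIPS, and the CROSS JOIN (...) ATV block could be dropped.
```

Compare the results of the original and the rewritten query on real data before relying on this, since the equivalence depends on the assumptions above.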


It looks like there are some subtle differences between the top and bottom queries, so I don't think you can easily rewrite this without the UNION. However, if you're certain the rows from the top and bottom queries will not overlap, use UNION ALL. Plain UNION has to make the combined result distinct, which forces the engine to do additional work, and that work can be considerable if there are a lot of records.
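
A minimal illustration of the difference, using constant rows rather than the tables from the question:

```sql
-- UNION removes duplicate rows from the combined result.
SELECT 1 AS n
UNION
SELECT 1;        -- returns one row

-- UNION ALL keeps every row and skips the de-duplication step.
SELECT 1 AS n
UNION ALL
SELECT 1;        -- returns two rows
```

In the question's query the two branches select different constants for RELEVANCE_SCORE_TYPE (2 and 1), so a row from one branch can never equal a row from the other. Keep in mind, though, that UNION also collapses duplicates produced within a single branch, so switching to UNION ALL could return more rows if the joins in either branch produce repeated rows.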

As @Catcall suggests, running EXPLAIN on both versions will show whether the UNION ALL approach actually helps.
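
One way to make that comparison, sketched on a cut-down pair of SELECTs that borrow table and column names from the question (the simplified column lists are an assumption for brevity, not the real query):

```sql
-- Plan with de-duplication.
EXPLAIN
SELECT 2 AS RELEVANCE_SCORE_TYPE, COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID = 4
UNION
SELECT 1, COMMUNICATIONS_ID
FROM network_communications
WHERE NETWORK_COMM_TYPE_ID = 1;

-- Plan without de-duplication.
EXPLAIN
SELECT 2 AS RELEVANCE_SCORE_TYPE, COMMUNICATIONS_ID
FROM consumer_action_log
WHERE COMM_TYPE_ID = 4
UNION ALL
SELECT 1, COMMUNICATIONS_ID
FROM network_communications
WHERE NETWORK_COMM_TYPE_ID = 1;
```

With UNION, the plan normally ends with a UNION RESULT row marked "Using temporary". With UNION ALL that row may disappear on newer MySQL versions, although a trailing ORDER BY over the whole result, as in the question, is likely to require a temporary table either way.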
