开发者

SQL left join query runs VERY slow

Basically I'm trying to pull a random poll question that a user has not yet responded to from a database. This query takes about 10-20 seconds to execute, which is obviously no good! The responses table is about 30K rows and the database also has about 300 questions.

SELECT开发者_运维技巧  questions.id
FROM  questions
LEFT JOIN  responses ON ( questions.id = responses.questionID
AND responses.username =  'someuser' ) 
WHERE
responses.username IS NULL 
ORDER BY RAND() ASC 
LIMIT 1

PK for questions and reponses tables is 'id' if that matters.

Any advice would be greatly appreciated.


You most likely need an index on

responses.questionID
responses.username 

Without the index searching through 30k rows will always be slow.


Here's a different approach to the query which might be faster:

SELECT q.id
FROM questions q
WHERE q.id NOT IN (
    SELECT r.questionID
    FROM responses r
    WHERE r.username = 'someuser'
)

Make sure there is an index on r.username and that should be pretty quick.

The above will return all the unanswered questios. To choose the random one, you could go with the inefficient (but easy) ORDER BY RAND() LIMIT 1, or use the method suggested by Tom Leys.


The problem is probably not the join, it's almost certainly sorting 30k rows by order rand()


See: Do not order by rand

He suggests (replace quotes in this example with your query)

SELECT COUNT(*) AS cnt FROM quotes

-- generate random number between 0 and cnt-1 in your programming language and run 
-- the query:

SELECT quote FROM quotes LIMIT $generated_number, 1

Of course you could probably make the first statement a subselect inside the second.


Is OP even sure the original query returns the correct result set?

I assume the "AND responses.username = 'someuser'" clause was added to join specification with intention that join will then generate null rightside columns for only the id's that someuser has not answered.

My question: won't that join generate null rightside columns for every question.id that has not been answered by all users? The left join works such that, "If any row from the target table does not match the join expression, then NULL values are generated for all column references to the target table in the SELECT column list."

In any case, nickf's suggestion looks good to me.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜