MySQL slow query and EXPLAIN gave strange answers
I'm joining two tables. Table unique_nucleosome_re has about 600,0开发者_JAVA百科00 records. Another table has 20,000 records. The strange thing is the performance and the answer from EXPLAIN is different depending on the condition in the WHERE clause. When it was WHERE n.chromosome = 'X' it took about 3 minutes. When it was WHERE n.chromosome = '2L' it took more than 30 minutes and the connection is gone.
SELECT n.name , t.transcript_start , n.start
FROM unique_nucleosome_re AS n
INNER JOIN tss_tata_range AS t
ON t.chromosome = n.chromosome
WHERE (t.transcript_start > n.end AND t.ts_minus_250 < n.start )
AND n.chromosome = 'X'
ORDER BY t.transcript_start
;
This is the answer from EXPLAIN. when the WHERE is n.chromosome = 'X'
'1', 'SIMPLE', 'n', 'ALL', 'start_idx,end_idx,chromo_start', NULL, NULL, NULL, '606096', '48.42', 'Using where; Using join buffer'
when the WHERE is n.chromosome = '2L'
'1', 'SIMPLE', 'n', 'ref', 'start_idx,end_idx,chromo_start', 'chromo_start', '17', 'const', '68109', '100.00', 'Using where'
The number of records for X or 2L are almost the same. I spent last couple days but I couldn't figure it out. It may be a simple mistake I can't see or might be a bug. Could you help me?
First, without seeing any index information, I would have an index on your TSS_TData_Range on the Chromosome key and transcript_start (but a minimum of the chromosome key). I would also assume there is an index on chromosome on your unique_nucleosome_re table. Then, it appears the TSS is your SHORT table, so I would move THAT into the FIRST position of the query and invoke use of the "STRAIGHT_JOIN" clause...
SELECT STRAIGHT_JOIN
n.name,
t.transcript_start,
n.start
FROM
tss_tdata_range t,
unique_nucleosome_re n
where
t.chromosome = 'X'
and t.chromosome = n.chromosome
and t.transcript_start > n.end
and t.ts_minus_250 < n.start
order by
t.transcript_start
I'd be interested in the performance too if it works well for you...
精彩评论