Mysql NOT IN operator performance problems for large result sets?
I have the following two queries:
select count(*)
from segmentation_cycle_recipients scr
, segmentation_instance si
where si.access_code=scr.access_code
and si.segment_id is NOT NULL;
Returns 13429 rows in 0.2 secs
2)
select count(*)
from segmentation_cycle_recipients scr
, segmentation_instance si, web_pat_info wpi
where si.access_code=scr.access_code and scr.siebel_row_id=wpi.siebel_id
and si.segment_id is NOT NULL;
Returns 4003 rows in 0.48 secs
Now, I want 1)-2) So I wrote the following query:
select 开发者_StackOverflow中文版 count(*)
from segmentation_cycle_recipients scr
, segmentation_instance si
where si.access_code=scr.access_code
and si.segment_id is NOT NULL
and scr.siebel_row_id NOT IN (select scr.siebel_row_id
from segmentation_cycle_recipients scr
, segmentation_instance si
, web_pat_info wpi where si.access_code=scr.access_code
and scr.siebel_row_id=wpi.siebel_id and si.segment_id is NOT NULL);
I was expecting 13429-4003=9426 rows but the query takes forever (have to kill the query command) to execute. It even adds a counter in the "slow queries" listing in mysql>status;)
It returns < 100ms in development environment where the result set is much smaller. So I believe the query itself is right.
I believe, using NOT IN is a known performance problem in Mysql (Oracle has the MINUS operator). Any suggestions on how to improve the performance on this query?
SELECT COUNT(*)
FROM segmentation_cycle_recipients scr
JOIN segmentation_instancs si
ON si.access_code = scr.access_code
LEFT JOIN
web_pat_info wpi
ON wpi.siebel_id = scr.siebel_row_id
WHERE wpi.siebel_id IS NULL
AND si.segment_id is NOT NULL
Make sure that si.access_code
and wpi.siebel_id
are indexed, and wpi.siebel_id
is defined as NOT NULL
.
If the latter condition does not hold, replace wpi.siebel_id IS NULL
in the WHERE
clause with any other column defined as NOT NULL
.
You may be better served using the NOT EXISTS
clause.
select count(*)
from segmentation_cycle_recipients scr
, segmentation_instance si
where si.access_code=scr.access_code
and si.segment_id is NOT NULL
and NOT EXISTS (select scr2.siebel_row_id
from segmentation_cycle_recipients scr2
, segmentation_instance si2
, web_pat_info wpi2 where si2.access_code=scr2.access_code
and scr2.siebel_row_id=wpi2.siebel_id and si2.segment_id is NOT NULL
and scr.siebel_row_id=scr2.siebel_row_id);
精彩评论