
How to optimize this SQL DELETE statement

I have 3 tables. The first one, table1, has the id column as its primary key. The second table (table2) has a column table1_id that refers, as a foreign key, to table1.id. The third table (table3) has, like table2, a column table1_id that refers, as a foreign key, to table1.id.

I have to delete from table1 all the rows where table1.id is not in table2.table1_id and not in table3.table1_id.

Right now I am using this query:

DELETE FROM table1
WHERE  table1.id IN (SELECT table1.id
                     FROM   (table2
                             RIGHT OUTER JOIN table1
                               ON table2.table1_id = table1.id)
                            LEFT OUTER JOIN table3
                              ON table3.table1_id = table1.id
                     WHERE  table2.table1_id IS NULL
                            AND table3.table1_id IS NULL);  

But it is very slow and takes a lot of time. Is there a better approach to this DELETE statement?

If it helps, I can assume that table2 has more data than table3.

The database I am using is Apache Derby.

Thanks for the help.


Assuming you've got the obvious covered (indexes on table1.id, table2.table1_id and table3.table1_id), you don't need to perform outer joins just to test whether a key is in another table. You can use subqueries with EXISTS, or NOT EXISTS in your case.

And since you're only testing for existence, you can use the following pattern:

where not exists ( select 1 from... where... )
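
Applied to the tables in the question, that pattern might look like the sketch below. It uses only the table and column names from the question and assumes the foreign key columns are indexed. If the foreign keys are declared as constraints, Derby should already maintain a backing index for each of them.

```sql
-- Delete the table1 rows that no row in table2 or table3 references.
-- Each correlated subquery can stop at the first matching row,
-- so an index on table1_id keeps every probe cheap.
DELETE FROM table1
WHERE NOT EXISTS (SELECT 1
                  FROM   table2
                  WHERE  table2.table1_id = table1.id)
  AND NOT EXISTS (SELECT 1
                  FROM   table3
                  WHERE  table3.table1_id = table1.id);
```

Unlike NOT IN, NOT EXISTS still behaves as expected if table1_id contains NULLs.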


DELETE     table1
FROM       table1
LEFT JOIN  table2 ON table1.id = table2.table1_id
LEFT JOIN  table3 ON table1.id = table3.table1_id
WHERE table2.table1_id IS NULL
  AND table3.table1_id IS NULL


Do you know how many rows you are deleting? I agree with @Blindy that NOT EXISTS would probably be better in your case if Derby supports it (I don't know Derby, so I can't say for sure). However, if a lot of records are being deleted, you might want to do this in batches. Deleting 10,000,000 records is going to take a long time no matter how efficient the query is. Deleting them in a loop that does 1000 at a time is often better for the database, as it won't take a table lock and lock out users while the whole process runs. Again, I don't know Derby, so I don't know if this is true of Derby, but it certainly would help a large delete in most databases I am familiar with.
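
One way to batch this without any vendor-specific row-limit syntax is to restrict each statement to a range of ids and commit between statements. This is only a sketch: it assumes table1.id is numeric, and the bounds 1 and 1000 are illustrative values that the calling code would advance on each pass.

```sql
-- One batch: only ids 1..1000 are considered (assumes a numeric id).
DELETE FROM table1
WHERE id BETWEEN 1 AND 1000
  AND NOT EXISTS (SELECT 1
                  FROM   table2
                  WHERE  table2.table1_id = table1.id)
  AND NOT EXISTS (SELECT 1
                  FROM   table3
                  WHERE  table3.table1_id = table1.id);
-- Commit, then repeat with 1001..2000, and so on up to MAX(id).
```

Whether this actually avoids lock escalation in Derby depends on its configuration, so the batch size is something to measure rather than assume.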


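DELETE FROM table1
WHERE
   -- table1's key column is id; the IS NOT NULL filters matter because
   -- a single NULL in the subquery result makes NOT IN match nothing.
   id NOT IN (SELECT table1_id FROM table2 WHERE table1_id IS NOT NULL)
   AND
   id NOT IN (SELECT table1_id FROM table3 WHERE table1_id IS NOT NULL)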