How to optimize this SQL DELETE statement
I have 3 tables. The first one, table1, has the id column as its primary key. The second table (table2) has a column table1_id that refers to table1.id as a foreign key. The third table (table3) has, like table2, a column table1_id that refers to table1.id as a foreign key.
I have to delete from table1 all the rows where table1.id is not in table2.table1_id and not in table3.table1_id.
Right now I am using this query:
DELETE FROM table1
WHERE table1.id IN (SELECT table1.id
FROM (table2
RIGHT OUTER JOIN table1
ON table2.table1_id = table1.id)
LEFT OUTER JOIN table3
ON table3.table1_id = table1.id
WHERE table2.table1_id IS NULL
AND table3.table1_id IS NULL);
but it is very slow and takes a lot of time. Is there a better approach to this delete statement?
If it helps, I can assume that table2 has more data than table3.
The database I am using is Apache Derby.
Thanks for the help.
Assuming you've got the obvious covered (indexes created on table1.id, table2.table1_id and table3.table1_id), you don't need to perform outer joins just to test whether a key is in another table. You can use subqueries and exists(), or not exists() in your case.
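If the indexes are not there yet, a minimal sketch of creating them could look like the following. The index names are hypothetical. Derby already creates backing indexes for declared PRIMARY KEY and FOREIGN KEY constraints, so these statements should only be needed if the table1_id columns were not declared as foreign key constraints.

```sql
-- Hypothetical index names; only needed when table1_id is not already
-- backed by an index from a declared FOREIGN KEY constraint.
CREATE INDEX idx_table2_table1_id ON table2 (table1_id);
CREATE INDEX idx_table3_table1_id ON table3 (table1_id);
```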
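And since you're only testing for existence, the subquery can select a constant instead of real columns, so you can use the following pattern:
where not exists ( select 1 from... where... )
(In SQL Server you could also write select top 1 1, but exists() only needs to find one matching row anyway, and Derby has no TOP keyword.)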
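Applied to the three tables from the question, the pattern might look like this. It is a sketch that assumes the table and column names given in the question and has not been benchmarked against Derby.

```sql
-- Delete every table1 row that is referenced by neither table2 nor table3.
-- Each correlated subquery only has to find one matching row.
DELETE FROM table1
WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE table2.table1_id = table1.id)
  AND NOT EXISTS (SELECT 1 FROM table3 WHERE table3.table1_id = table1.id);
```

With indexes on table2.table1_id and table3.table1_id, each existence check can be answered by an index lookup rather than a join over the whole table.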
DELETE table1
FROM table1
LEFT JOIN table2 ON table1.id = table2.table1_id
LEFT JOIN table3 ON table1.id = table3.table1_id
WHERE table2.table1_id IS NULL
AND table3.table1_id IS NULL
Do you know how many rows you are deleting? I agree with @Blindy that not exists would probably be better in your case if Derby supports it (I don't know Derby, so I can't say for sure).
However, if a lot of records are being deleted, you might want to do this in batches. Deleting 10,000,000 records is going to take a long time no matter how efficient the query is. Deleting them in a loop that does 1000 at a time is often better for the database, as it won't take a table lock and lock out users while the whole process runs. Again, I don't know whether this is true of Derby, but it would certainly help a large delete in most databases I am familiar with.
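One way to sketch a single batch in SQL is shown below. It assumes a Derby version that accepts FETCH FIRST inside a subquery (older releases only allow it on a top-level SELECT), and the batch size of 1000 is just the figure mentioned above. The calling program would rerun the statement, committing after each run, until it reports zero affected rows.

```sql
-- One batch: delete at most 1000 unreferenced rows from table1.
-- Assumption: this Derby version allows FETCH FIRST in a subquery.
DELETE FROM table1
WHERE id IN (
    SELECT t1.id
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.table1_id = t1.id)
      AND NOT EXISTS (SELECT 1 FROM table3 t3 WHERE t3.table1_id = t1.id)
    FETCH FIRST 1000 ROWS ONLY
);
-- Commit here, then repeat until the statement deletes 0 rows.
```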
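-- table1's key column is id; table1_id exists only in table2 and table3.
-- If table1_id can be NULL in either table, NOT IN will match no rows.
DELETE FROM table1
WHERE
id NOT IN (SELECT table1_id FROM table2)
AND
id NOT IN (SELECT table1_id FROM table3)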