Delete query really slow - for largish amounts of data
I have a database with data for 24million users and I want to purge 23/24ths of the data evenly across all of the tables.
The delete statements are taking hours, is there anything I can do to improve the efficency of these query:
delete from [feedback] where [key] %24<>0 and [key] not in (50697,9179, 19392753, 3780,14095, 455141,455144,1576550,719307706,10233,706507,315321,2998138,19275591,730开发者_如何转开发37336,23371,11904062,08496,71959,79765,9969,02315,1850,666824,32289,826578,66284,718017,85204,192179,9406787,469844,9843,13801850,8575204,8927569)
If you intend to keep just 1/24th of the data, it will be quicker to select that data out into an alternative location, truncate the original table and copy it back.
If key constraints / operational constraints prevent this, then you should batch the deletions into chunks into a number of thousand rows per deletion, such as 10k rows per deletion and loop that command, to avoid one very large transaction (which is what is occuring with a single delete statement.)
If you can, consider dropping unused indexes and keys before performing deletes. The time updating indexes can be considerable. Don't forget to re-create the indexes when you are finished.
The scalable way (i.e. it doesn't matter how many or what percent of your data is being deleted) to do this is to "chunk up" your deletes. DELETE
is a logged action, which means that before the rows are actually deleted, they are first written to the log file, and for 24 million rows that could be a fairly big single write. Here's the basic template (untested, beware typos):
DECLARE @HowManyLastTime int
SET @HowManyLastTime = 1
WHILE @HowManyLastTime <> 0
BEGIN
DELETE top (10000)
from [feedback] where [key] %24<>0 and [key] not in (50697,9179, 19392753, 3780,14095, 455141,455144,1576550,719307706,10233,706507,315321,2998138,19275591,73037336,23371,11904062,08496,71959,79765,9969,02315,1850,666824,32289,826578,66284,718017,85204,192179,9406787,469844,9843,13801850,8575204,8927569)
SET @HowManyLastTime = @@rowcount
END
(10000 is an arbitrarily picked number, do some testing on various sizes to see what works best for your table.)
Will going the other way help?
CREATE Table FeedBack2 AS
SELECT * FROM Feedback
WHERE Key % 24 = 0
AND Key IN (50697,9179, 19392753, 3780,14095, 455141,455144,1576550,719307706,10233,706507,315321,2998138,19275591,73037336,23371,11904062,08496,71959,79765,9969,02315,1850,666824,32289,826578,66284,718017,85204,192179,9406787,469844,9843,13801850,8575204,8927569)
TRUNCATE TABLE Feedback
INSERT INTO Feedback
SELECT * FROM Feedback2
DROP Table Feedback2
EDIT: The SQL above isn't precise. But I hope you will get the idea based on the above.
精彩评论