20 hour query from hell
I am running this query on a table that has half a million records with about 7 fields:
delete from qvalues where rid not in
(
select min(rid) from qvalues
group by name,compound,rowid
having COUNT(*)>1)
and rid not in (select min(rid) from qvalues
group by name,compound,rowid
having COUNT(*)=1);
Why is it taking SO LONG?
What can I do to optimize it?
I'm running SQL Server 2008.
Your best bet is to look at the execution plan and see what's taking the longest. I'd start by reducing the two NOT IN subqueries to one:
delete
from qvalues
where rid not in
(
select min(rid)
from qvalues
group by name, compound, rowid
having count(1) >= 1
)
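To see where the time actually goes, you can turn on the runtime statistics before running the statement (or just the subquery on its own). A minimal sketch, assuming you're running it from SSMS:

-- Report I/O and CPU timings for each statement in the batch (shown in the Messages tab)
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- You can also toggle "Include Actual Execution Plan" in SSMS before executing the DELETE.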
You might also want to look into putting an index on name, compound and rowid.
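One possible shape for that index, as a sketch (the index name is made up, and including rid assumes rid is not already your clustered key):

-- Composite index covering the GROUP BY columns; INCLUDE (rid) lets MIN(rid)
-- be answered from the index without touching the base table
CREATE NONCLUSTERED INDEX IX_qvalues_name_compound_rowid
ON qvalues (name, compound, rowid)
INCLUDE (rid);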
As well as considering batching and indexing, you can also rewrite the query itself to remove the subqueries and be more efficient.
-- Number the rows in each (name, compound, rowid) group; keep the lowest rid, delete the rest
;WITH cte As
(
SELECT ROW_NUMBER() OVER (PARTITION BY name, compound, rowid ORDER BY rid) AS RN
FROM qvalues
)
DELETE FROM cte WHERE RN > 1
How many duplicates per group are there likely to be? If many, it might be quicker to insert the records you want to keep into a new table and then do a drop and rename.
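A rough sketch of that approach (untested against your schema; you'd need to recreate any indexes and constraints on the new table afterwards):

-- Copy only the survivors (one row per name/compound/rowid group) into a new table
SELECT q.*
INTO qvalues_keep
FROM qvalues q
JOIN (SELECT MIN(rid) AS rid
      FROM qvalues
      GROUP BY name, compound, rowid) k ON k.rid = q.rid;

-- Swap the tables
DROP TABLE qvalues;
EXEC sp_rename 'qvalues_keep', 'qvalues';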
Not knowing the actual data involved, I can only give some general advice: run each of the subqueries individually and see how long each one takes.
Also, am I reading this wrong, or are you deleting all but 2 entries from this table (if rid is unique)?
1 - Use batching. This lets you resume, and gives you an idea of progress. As an example:
DECLARE @Msg varchar(max)

WHILE 1 = 1
BEGIN
-- Delete in chunks so the transaction log stays small and the job can be stopped and resumed
DELETE TOP (100000) qvalues
FROM qvalues WITH (TABLOCKX)
<logic here>

-- @@ROWCOUNT comes from the DELETE above; a short final chunk ends the loop
IF @@ROWCOUNT < 100000 BREAK

SET @Msg = 'Deleted another 100,000 rows'
SET @Msg = @Msg + ' ' + CONVERT(varchar(20), GETDATE(), 101) + ' ' + CONVERT(varchar(20), GETDATE(), 108)
RAISERROR(@Msg, 0, 1) WITH NOWAIT  -- severity 0 plus NOWAIT prints progress immediately
END
Note that I also added a WITH (TABLOCKX) hint, which puts a table lock on and eliminates row-level locking. It'll cause issues with concurrent reads, but hopefully you don't have anything else querying that table while you are deleting.
2 - Fix your logic. This is impossible to write for you without a better idea of your table structure, but some options are:
- Materialize a table with the values you want to compare against and do a join (see the sketch after this list). If the delete is big enough, you can put a clustered index on the temp table on the join field. I've used this a lot with great success.
- If you expect to delete a large portion of the records, SELECT INTO a new table and drop the old one. This is a minimally logged operation and runs really quickly on SQL Server 2008 compared to a delete, which needs to log the values for each row.
- Drop all your indexes except the ones you are using for selection and your clustered index. Keeping a clustered index is normally fine for a delete of this type if the cluster is relevant to the query.
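A rough sketch of the materialize-and-join option, assuming rid uniquely identifies rows and you keep the lowest rid per group (names are illustrative and untested):

-- Materialize the rids to keep once, instead of re-running a grouped subquery
SELECT MIN(rid) AS rid
INTO #keep
FROM qvalues
GROUP BY name, compound, rowid;

-- Clustered index on the join field makes the anti-join below cheap
CREATE CLUSTERED INDEX IX_keep_rid ON #keep (rid);

-- Delete everything that is not in the keep list
DELETE q
FROM qvalues q
LEFT JOIN #keep k ON k.rid = q.rid
WHERE k.rid IS NULL;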
first thought:
delete from qvalues where rid not in
(
select min(rid) from qvalues
group by name,compound,rowid
having COUNT(*)>1
UNION
select min(rid) from qvalues
group by name,compound,rowid
having COUNT(*)=1);
Maybe it's also a good idea to ensure that SQL Server knows you are doing an "uncorrelated subselect" (because correlated subselects take much longer):
delete a from qvalues a where a.rid not in
(
select min(b.rid) from qvalues b
group by b.name,b.compound,b.rowid
having COUNT(*)>1
UNION
select min(c.rid) from qvalues c
group by c.name,c.compound,c.rowid
having COUNT(*)=1);
and of course you should consider using indexes (especially on rid, but also on name, compound and rowid)
My SQLs are not tested - I hope you get the idea of what I was trying to show.
PS: your SQL requires a lot of calculation (especially the HAVING clauses); could you try to find another solution for your problem?
What is your machine's setup? Do you have enough memory? Where do you see the most utilisation while the query runs (CPU, memory, disk IO)?
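If you're not sure which resource is the bottleneck, one quick check from a second connection while the delete is running is to look at the standard DMVs (SQL Server 2008); just a sketch:

-- What is the delete session doing or waiting on right now?
SELECT r.session_id, r.status, r.command, r.wait_type, r.wait_time,
       r.cpu_time, r.logical_reads, r.writes
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID;  -- exclude this monitoring query itself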