updlock vs for update cursor

2023-03-15 23:47 问答作者：

I need to update a column of all rows of a table and I need to use UPDLOCK to do it.

For example:

UPDATE table  (UPDLock)
   SET column_name = ‘123’

Another alternative is to use an for update cursor and update each row. The advantage with the second approach is that the lock is not held till the end of the transaction and concurrent updates of the same rows can happen sooner. At the same time update cursors are said to have bad performance. Which is a better approach?

EDIT:

Assume the column is updated with a value th开发者_Python百科at is derived from another column in the table. In other words, column_name = f(column_name_1)

You cannot give an UPDLOCK hint to a write operation, like UPDATE statement. It will be ignored, since all writes (INSERT/UPDATE/DELETE) take the same lock, an exclusive lock on the row being updated. You can quickly validate this yourself:

create table heap (a int);
go

insert into heap (a) values (1)
go

begin transaction
update heap 
   --with (UPDLOCK) 
   set a=2

select * from sys.dm_tran_locks
rollback

If you remove the comment -- on the with (UPDLOCK) you'll see that you get excatly the same locks (an X lock on the physical row). You can do the same experiment with a B-Tree instead of a heap:

create table btree (a int not null identity(1,1) primary key, b int)
go

insert into btree (b) values (1)
go

begin transaction
update btree 
   --with (UPDLOCK) 
   set b=2

select * from sys.dm_tran_locks
rollback

Again, the locks acquired will be identical with or w/o the hint (an exclusive lock on the row key).

Now back to your question, can this whole table update be done in batches? (since this is basically what you're asking). Yes, if the table has a primary key (to be precise what's required is an unique index to batch on, preferable the clustered index to avoid tipping point issues). Here is an example how:

create table btree (id int not null identity(1,1) primary key, b int, c int);
go

set nocount on;
insert into btree (b) values (rand()*1000);
go 1000

declare @id int = null, @rc int;
declare @inserted table (id int);
begin transaction;
-- first batch has no WHERE clause
with cte as (
    select top(10) id, b, c
    from btree
    order by id)
update cte
    set c = b+1
    output INSERTED.id into @inserted (id);
set @rc = @@rowcount;
commit;

select @id = max(id) from @inserted;
delete from @inserted;
raiserror (N'Updated %d rows, up to id %d', 0,0,@rc, @id);

begin transaction;
while (1=1)
begin
    -- update the next batch of 10 rows, now it has where clause
    with cte as (
        select top(10) id, b, c
        from btree
        where id > @id
        order by id)
    update cte
        set c = b+1
        output INSERTED.id into @inserted (id);
    set @rc = @@rowcount;
    if (0 = @rc)
        break;
    commit;
    begin transaction;
    select @id = max(id) from @inserted;
    delete from @inserted;
    raiserror (N'Updated %d rows, up to id %d', 0,0,@rc, @id);
end 
commit
go

If your table doesn't have a unique clustered index then it becomes really tricky to do this, you would need to do the same thing a cursor has to do. While from a logical point of view the index is not required, not having it would cause each batch to do a whole-table-scan, which would be pretty much disastrous.

In case you wonder what happens if someone inserts a value behind the current @id, then the answer is very simple: the exactly same thing that would happen if someone inserts a value after the whole processing is complete.

Personally I think the single UPDATE will be much better. There are very few cases where a cursor will be better overall, regardless of concurrent activity. In fact the only one that comes to mind is a very complex running totals query - I don't think I've ever seen better overall performance from a cursor that is not read only, only SELECT queries. Of course, you have much better means of testing which is "a better approach" - you have your hardware, your schema, your data, and your usage patterns right in front of you. All you have to do is perform some tests.

That all said, what is the point in the first place of updating that column so that every single row has the same value? I suspect that if the value in that column has no bearing to the rest of the row, it can be stored elsewhere - perhaps a related table or a single-row table. Maybe the value in that column should be NULL (in which case you get it from the other table) unless it is overriden for a specific row. It seems to me like there is a better solution here than touching every single row in the table every time.

继续阅读：sql sql-server tsql

updlock vs for update cursor

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？