MS-SQL Server selecting rows, locking rows. Unique returns
I'm selecting the available login infos from a DB randomly via the stored procedure below. But when multiple threads want to get the available login infos, duplicate records are returned although I'm updating the timestamp field of the record.
How can I lock the rows here so that the record returned once won't be returned again?
Putting
WITH (HOLDLOCK, ROWLOCK)
didn't help!
SELECT TOP 1 @uid = [LoginInfoUid]
FROM [ZPer].[dbo].[LoginInfos]
WITH (HOLDLOCK, ROWLOCK)
WHERE ([Type] = @type)
... ... ...
ALTER PROCEDURE [dbo].[SelectRandomLoginInfo]
-- Add the parameters for the stored procedure here
@type int = 0,
@expireTimeout int = 86400 -- 24 * 60 * 60 = 24h
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @processTimeout int = 10 * 60
DECLARE @uid uniqueidentifier
BEGIN TRANSACTION
-- SELECT [LoginInfos] which are currently not being processed ([ProcessDate] is timed out) and which are not expired.
SELECT TOP 1 @uid = [LoginInfoUid]
FROM [MyDb].[dbo].[LoginInfos]
WITH (HOLDLOCK, ROWLOCK)
WHERE ([Type] = @type) AND ([Uid] IS NOT NULL) AND ([Key] IS NOT NULL) AND
(
([ProcessDate] IS NULL OR DATEDIFF(second, [ProcessDate], GETDATE()) > @processTimeout) OR
(
DATEDIFF(second, [UpdateDate], GETDATE()) <= @expireTimeout OR
([UpdateDate] IS NULL AND DATEDIFF(second, [CreateDate], GETDATE()) <= @expireTimeout)
)
)
ORDER BY NEWID()
-- UPDATE the selected record so that it won't be re-selected.
UPDATE [MyDb].[dbo].[LoginInfos] SET
[UpdateDate] = GETDATE(), [ProcessDate] = GETDATE()
WHERE [LoginInfoUid] = @uid
-- Return the full record data.
SELECT *
FROM [MyDb].[dbo].[LoginInfos]
WHERE [LoginInfoUid] = @uid
COMMIT TRANSACTION
END
Locking a row in shared mode does nothing to prevent multiple threads from reading the same row. You want to lock the row exclusively with the XLOCK hint.
Also, you are using a very low-precision marker to determine candidate rows (GETDATE has 3ms precision), so you will get a lot of false positives. You must use a precise field, like a bit (processing 0 or 1).
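As a rough sketch of these two suggestions only, assuming a hypothetical non-nullable bit column [Processing] has been added to the table (it is not in the original schema), the selection could look something like this. The random ordering and the expiry checks from the question are deliberately left out.

```sql
DECLARE @type int = 0;            -- same meaning as the procedure parameter
DECLARE @uid uniqueidentifier;

BEGIN TRANSACTION;

-- XLOCK takes an exclusive lock on the row that is read and holds it until
-- the transaction ends, so another session running this same statement
-- has to wait for that row instead of reading it as well.
SELECT TOP (1) @uid = [LoginInfoUid]
FROM [MyDb].[dbo].[LoginInfos] WITH (XLOCK, ROWLOCK)
WHERE [Type] = @type
  AND [Processing] = 0;           -- hypothetical bit column: 0 = free, 1 = taken

-- Mark the row as taken before the lock is released at COMMIT.
UPDATE [MyDb].[dbo].[LoginInfos]
SET [Processing] = 1
WHERE [LoginInfoUid] = @uid;

COMMIT TRANSACTION;
```

Concurrent callers are serialized here rather than skipped past locked rows, so this only illustrates the locking and the precise marker, not a scalable dequeue.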
Ultimately you are treating the LoginInfos table as a queue, so I suggest you read Using tables as Queues. The way to achieve what you want is to use UPDATE ... with an OUTPUT clause.
But you have an additional requirement to select a random login, which would throw everything haywire. Are you really, really, 100% convinced that you need randomness? It is an extremely unusual requirement and you will have a heck of a hard time coming up with a solution that is correct and performant. You'll get duplicates and you're going to deadlock till the day after.
A first attempt would go something like:
with cte as (
select top 1 ...
from [LoginInfos] with (readpast)
where processing = 0 and ...
order by newid())
update cte
set processing = 1
output cte...
But because the NEWID order requires a full table scan and a sort to pick the one lucky winner row, this will 1) perform extremely poorly and 2) deadlock constantly.
Now you may take this as a random forum rant, but it so happens I've been working with SQL Server backed queues for some years now and I know what you want will not work. You must modify your requirement, specifically the randomness, and then you can go back to the article linked above and use one of the tried and tested schemes.
Edit
If you don't need randomness then it is somewhat simpler. The gist of the tables-as-queues issue is that you must seek your output row; you absolutely cannot scan for it. Scanning over a queue not only performs badly, it is a guaranteed deadlock because of the way queues are used (highly concurrent dequeue operations where all threads want the same row). To achieve this your WHERE clause must be SARG-able, which depends on 1) the expressions in the WHERE clause and 2) the clustered index key. Your expression cannot contain OR conditions, so lose all the IS NULL OR ... checks, modify the fields to be non-nullable and always populate them. Second, you must compare in an index-friendly manner: not DATEDIFF(..., field, ...) < @variable but instead always field < DATEADD(..., @variable, ...), because the second form is SARG-able. And you must settle for one of the two fields, [ProcessDate] or [UpdateDate]; you cannot seek on both. All of this, of course, calls for a much stricter and tighter state machine in your application, but that is a good thing: the lax conditions and OR clauses are only an indication of poor data input.
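To make the difference between the two comparison forms concrete, here is a small sketch. It assumes the processing timestamp column is named [ProcessDate] (an assumed name) and that it is non-nullable; @cutoff is a variable introduced only for this illustration.

```sql
DECLARE @type int = 0;
DECLARE @processTimeout int = 10 * 60;

-- Not SARG-able: the column is wrapped in a function, so the expression
-- must be evaluated for every candidate row and an index on it cannot be seeked.
SELECT [LoginInfoUid]
FROM [MyDb].[dbo].[LoginInfos]
WHERE [Type] = @type
  AND DATEDIFF(second, [ProcessDate], GETDATE()) > @processTimeout;

-- SARG-able: compute the cutoff once, then compare the bare column to it,
-- which lets the optimizer seek on an index keyed by ([Type], [ProcessDate]).
DECLARE @cutoff datetime = DATEADD(second, -@processTimeout, GETDATE());

SELECT [LoginInfoUid]
FROM [MyDb].[dbo].[LoginInfos]
WHERE [Type] = @type
  AND [ProcessDate] < @cutoff;
```

The two predicates select roughly the same rows (DATEDIFF counts second boundaries, so results can differ at the edge), but only the second one leaves the column untouched.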
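declare @now datetime, @expired datetime;
select @now = getdate();
-- rows whose [ProcessDate] is older than this cutoff have timed out
select @expired = dateadd(second, -@processTimeout, @now);
with cte as (
-- top (1): dequeue a single row per call
select top (1) *
from [MyDb].[dbo].[LoginInfos] WITH (readpast, xlock)
WHERE
[Type] = @type AND
[ProcessDate] < @expired
order by [ProcessDate])
update cte
set [ProcessDate] = @now
output INSERTED.*;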
For this to work, the clustered index of the table must be on ([Type], [ProcessDate]) (which implies making the primary key LoginInfoUid a non-clustered index).
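A possible way to set up that index layout is sketched below. The constraint name PK_LoginInfos and the index name are hypothetical, the timestamp column is assumed to be called [ProcessDate], and the primary key is assumed to be the [LoginInfoUid] column used in the question's procedure; check the real names in your schema first.

```sql
-- Drop the existing clustered primary key (constraint name is hypothetical).
-- This fails if foreign keys reference it; those would have to be dropped first.
ALTER TABLE [dbo].[LoginInfos] DROP CONSTRAINT [PK_LoginInfos];

-- Cluster the table on the columns the dequeue statement seeks on.
CREATE CLUSTERED INDEX [CIX_LoginInfos_Type_ProcessDate]
ON [dbo].[LoginInfos] ([Type], [ProcessDate]);

-- Re-create the primary key as a non-clustered index.
ALTER TABLE [dbo].[LoginInfos]
ADD CONSTRAINT [PK_LoginInfos] PRIMARY KEY NONCLUSTERED ([LoginInfoUid]);
```

On a large table each of these steps rebuilds data, so this is something to run in a maintenance window rather than under load.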