开发者

Have "select for update" block on nonrexisting rows

we have some persistent data in an application, that is queried from a server and then stored in a database so we can keep track of additional information. Because we do not want to query when an object is used in the memory we do an select for update so that other threads that want to get the same data will be blocked.

I am not sure how select for update handles non-existing rows. If the row does not exist and another thread tries to do another select for update on the same row, will this thread be blocked until the other transaction finishes or will it also get an empty result set? If it does only get an empty result set is there any way to make it block as well, for example by inserting the missing row immediately?

EDIT:

Because there was a remark, that we might lock too much, here some more details on the concrete usage in our case. In reduced pseudocode our programm flow looks like this:

d = queue.fetch();
r = SELECT * FROM table WHERE key = d.key() FOR UPDATE;
if r.empty() then
  r = get_data_from_somewhere_else();

new_r = process_stuff( r );


if Data was present then
   update row to new_r
else
   insert new_r

This code is run in multiple thread and the data that is fetched from the queue might be concerning the same row in the database (hence the lock). However if multiple threads are using data that needs the same row, then these threads need to be sequentialized (order does not matter). However this sequentialization fails, if t开发者_Go百科he row is not present, because we do not get a lock.

EDIT:

For now I have the following solution, which seems like an ugly hack to me.

select the data for update
if zero rows match then
  insert some dummy data   // this will block if multiple transactions try to insert
  if insertion failed then
    // somebody beat us at the race
    select the data for update

do processing

if data was changed then
   update the old or dummy data
else
   rollback the whole transaction

I am neither 100% sure however that this actually solves the problem, nor does this solution seem good style. So if anybody has to offer something more usable this would be great.


I am not sure how select for update handles non-existing rows.

It doesn't.

The best you can do is to use an advisory lock if you know something unique about the new row. (Use hashtext() if needed, and the table's oid to lock it.)

The next best thing is a table lock.

That being said, your question makes it sound like you're locking way more than you should. Only lock rows when you actually need to, i.e. write operations.


Example solution (i haven't found better :/)

Thread A:

BEGIN;
SELECT pg_advisory_xact_lock(42); -- database semaphore arbitrary ID
SELECT * FROM t WHERE id = 1;
DELETE FROM t WHERE id = 1;
INSERT INTO t (id, value) VALUES (1, 'thread A');
SELECT 1 FROM pg_sleep(10); -- only for race condition simulation
COMMIT;

Thread B:

BEGIN;
SELECT pg_advisory_xact_lock(42); -- database semaphore arbitrary ID
SELECT * FROM t WHERE id = 1;
DELETE FROM t WHERE id = 1;
INSERT INTO t (id, value) VALUES (1, 'thread B');
SELECT 1 FROM pg_sleep(10); -- only for race condition simulation
COMMIT;

Causes always correct order of transactions execution.


Looking at the code added in the second edit, it looks right.

As for it looking like a hack, there's a couple options - basically it's all about moving the database logic to the database.

One is simply to put the whole select for update, if not exist then insert logic in a function, and do select get_object(key1,key2,etc) instead.

Alternatively, you could make an insert trigger that will ignore attempts to add an entry if it already exists, and simply do an insert before you do the select for update. This does have more potential to interfere with other code already in place, though.

(If I remember to, I'll edit and add example code later on when I'm in a position to check what I'm doing.)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜