How to atomically move rows from one table to another?
I am collecting readings from several thousand sensors and storing them in a MySQL database. There are several hundred inserts per second. To improve the insert performance I am storing the values initially into a MEMORY buffer table. Once a minute I run a stored procedure which moves the inserted rows from the memory buffer to a permanent table.
Basically I would like to do the following in my stored procedure to move the rows from the temporary buffer:
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
Unfortunately the previous is not usable because the data collection processes insert additional rows in "data_buffer" between INSERT and DELETE above. Thus those rows will get deleted开发者_如何学运维 without getting inserted to the "data" table.
How can I make the operation atomic or make the DELETE statement to delete only the rows which were SELECTed and INSERTed in the preceding statement?
I would prefer doing this in a standard way which works on different database engines if possible.
I would prefer not adding any additional "id" columns because of performance overhead and storage requirements.
I wish there was SELECT_AND_DELETE or MOVE statement in standard SQL or something similar...
I beleive this will work but will block until insert is done
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
INSERT INTO data (SELECT * FROM data_buffer FOR UPDATE);
DELETE FROM data_buffer;
COMMIT TRANSACTION;
A possible way to avoid all those problems, and to also stay fast, would be to use two data_buffer
tables (let's call them data_buffer1
and data_buffer2
); while the collection processes insert into data_buffer2
, you can do the insert
and delete
on data_buffer2
; than you switch, so collected data goes into data_buffer2
, while data is inserted+deleted from data_buffer1
into data
.
How about having a row id, get the max value before insert, make the insert and then delete records <= max(id)
This is a similar solution to @ammoQ's answer. The difference is that instead of having the INSERTing process figure out which table to write to, you can transparently swap the tables in the scheduled procedure.
Use RENAME in the scheduled procedure to swap tables:
CREATE TABLE IF NOT EXISTS data_buffer_new LIKE data_buffer;
RENAME TABLE data_buffer TO data_buffer_old, data_buffer_new TO data_buffer;
INSERT INTO data SELECT * FROM data_buffer_old;
DROP TABLE data_buffer_old;
This works because RENAME statement swaps the tables atomically, thus the INSERTing processes will not fail with "table not found". This is MySQL specific though.
I assume the tables are identical, with the same columns and primary key(s)? If that is the case, you could nestled select inside a where clause...something like this:
DELETE FROM data_buffer
WHERE primarykey IN (SELECT primarykey FROM data)
This is a MySQL specific solution. You can use locking to prevent the INSERTing processes from adding new rows while you are moving rows.
The procedure which moves the rows should be as follows:
LOCK TABLE data_buffer READ;
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
UNLOCK TABLE;
The code which INSERTs new rows in the buffer should be changed as follows:
LOCK TABLE data_buffer WRITE;
INSERT INTO data_buffer VALUES (1, 2, 3);
UNLOCK TABLE;
The INSERT process will obviously block while the lock is in place.
精彩评论