开发者

How do I mitigate duplicate row inserts based on a non-key column?

I need to import data from one MySQL table into another. The old table has a different outdated structure (which isn't terribly relevant). That said, I'm appending a field to the new table called "imported_id" which saves the original id from the old table in order to prevent duplicate imports of the old records.

My question now is, how do I actually prevent duplicates? Due to the parallel rollout of the new system with the old, the import will unfortunately need to be run more than once. I can't make the "import_id" field PK/UNIQUE because it will have null values for fields that do not come from the old table, thereby throwing an error when adding new fields. Is there a way to use some type of INSERT IGNORE on the fly for an arbitrary column that doesn't natively have constraints?

The more I think about this problem, the more I think I should handle it in the initial SELECT. However, I'd be interested in quality mechanisms by which to handle 开发者_开发百科this in general.

Best.


You should be able to create a unique key on the import_id column and still specify that column as nullable. It is only primary key columns that must be specified as NOT NULL.

That said, on the new table you could specify a unique key on the nullable import_id column and then handle any duplicate key errors when inserting from the old table into the new table using ON DUPLICATE KEY

Here's a basic worked example of what I'm driving at:

create table your_table
(id int unsigned primary key auto_increment,
someColumn    varchar(50) not null,
import_id int null,
UNIQUE KEY `importIdUidx1` (import_id)
);



insert into your_table (someColumn,import_id) values ('someValue1',1) on duplicate key update someColumn = 'someValue1';
insert into your_table (someColumn) values ('someValue2');
insert into your_table (someColumn) values ('someValue3');;
insert into your_table (someColumn,import_id) values ('someValue4',1) on duplicate key update someColumn = 'someValue4';

where the first and last inserts represent inserts from the old table and the 2nd and 3rd represent inserts from elsewhere.

Hope this helps and good luck!

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜