How to INSERT multiple rows when some might be DUPLICATES of an already-existing row?

So I have a checkbox form where users can select multiple values. They can then go back and select different values. Each value is stored as a row (UserID,value).

How do you do that INSERT when some rows might be duplicates of an already-existing row in the table?

Should I first delete the existing values and then INSERT the new values?

ON DUPLICATE KEY UPDATE seems tricky since I would be INSERTing multiple rows at once, so how would I define and separate just the ones that need UPDATING vs. the ones that need INSERTING?

For example, let's say a user makes his first-time selection:

INSERT INTO 
   Choices(UserID,value) 
VALUES 
   ('1','banana'),('1','apple'),('1','orange'),('1','cranberry'),('1','lemon')

What if the user goes back later and makes different choices which include SOME of the values in his original query which will thus cause duplicates?

How should I handle that best?


In my opinion, simply deleting the existing choices and then inserting the new ones is the best way to go. It may not be the most efficient overall, but it is simple to code and thus has a much better chance of being correct.

Otherwise it is necessary to find the intersection of the new choices and old choices. Then either delete the obsolete ones or change them to the new choices (and then insert/delete depending on if the new set of choices is bigger or smaller than the original set). The added risk of the extra complexity does not seem worth it.

Edit: As @Andrew points out in the comments, deleting the originals en masse may not be a good plan if these records happened to be "parent" records in a referential integrity definition. My thinking was that this seemed like an unlikely situation based on the OP's description. But it is definitely worth consideration.


It's not clear to me when you would ever need to update a record in the database in your case.

It sounds like you need to maintain a set of choices per user, which the user may on occasion change. Therefore, each time the user provides a new set of choices, any prior set of choices should be discarded. So you would delete all old records, then insert any new ones.
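
The delete-then-insert approach described above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module in place of MySQL so that it runs on its own, the function name replace_choices is made up, and the primary key on (UserID, value) is an assumption about the schema. Wrapping both statements in one transaction means a failed INSERT does not leave the user with no choices at all.

```python
import sqlite3

def replace_choices(conn, user_id, values):
    """Discard the user's previous choices and store the new set (hypothetical helper)."""
    with conn:  # commits on success, rolls back if either statement raises
        conn.execute("DELETE FROM Choices WHERE UserID = ?", (user_id,))
        conn.executemany(
            "INSERT INTO Choices (UserID, value) VALUES (?, ?)",
            [(user_id, v) for v in values],
        )

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per (UserID, value) pair.
conn.execute(
    "CREATE TABLE Choices (UserID INTEGER, value TEXT, PRIMARY KEY (UserID, value))"
)

# First-time selection, then a later change of mind.
replace_choices(conn, 1, ["banana", "apple", "orange", "cranberry", "lemon"])
replace_choices(conn, 1, ["banana", "kiwi"])

rows = [r[0] for r in conn.execute(
    "SELECT value FROM Choices WHERE UserID = 1 ORDER BY value")]
print(rows)  # ['banana', 'kiwi']
```

The same two statements should carry over to MySQL with that driver's placeholder style, provided they run inside a single transaction.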

You might consider carrying out a comparison of the prior and new choices - either in the server or client code - in order to calculate the minimum set of deletes and/or inserts needed to reduce database writes. But that smells like premature optimisation.
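
For what it's worth, the comparison itself is small if the choices are treated as sets. A minimal sketch, assuming the old and new choices are already available as lists of strings (diff_choices is a made-up name):

```python
def diff_choices(old, new):
    """Return (values to delete, values to insert) to turn `old` into `new`."""
    old_set, new_set = set(old), set(new)
    to_delete = old_set - new_set  # previously chosen, no longer selected
    to_insert = new_set - old_set  # newly selected, not stored yet
    return to_delete, to_insert

to_delete, to_insert = diff_choices(
    ["banana", "apple", "orange", "cranberry", "lemon"],
    ["banana", "apple", "kiwi"],
)
print(sorted(to_delete))  # ['cranberry', 'lemon', 'orange']
print(sorted(to_insert))  # ['kiwi']
```

Values present in both sets need no write at all, which is where the saving would come from.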

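Putting all that to one side - if you just want a re-inserted row to be ignored, then use INSERT IGNORE. Provided the table has a primary or unique key on (UserID,value), rows that already exist will be quietly skipped and new ones will be inserted. Note that this only adds rows; any choices the user has deselected would still have to be deleted separately.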
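
A rough, runnable illustration of the ignore-duplicates behaviour, again using Python's sqlite3 as a stand-in. SQLite spells the statement INSERT OR IGNORE, whereas MySQL's form is INSERT IGNORE; the schema and the add_choices helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The key on (UserID, value) is what makes a repeated row count as a duplicate.
conn.execute(
    "CREATE TABLE Choices (UserID INTEGER, value TEXT, PRIMARY KEY (UserID, value))"
)

def add_choices(conn, user_id, values):
    """Insert choices, silently skipping rows that already exist (hypothetical helper)."""
    with conn:
        # SQLite syntax; the MySQL equivalent starts with INSERT IGNORE INTO.
        conn.executemany(
            "INSERT OR IGNORE INTO Choices (UserID, value) VALUES (?, ?)",
            [(user_id, v) for v in values],
        )

add_choices(conn, 1, ["banana", "apple", "orange"])
add_choices(conn, 1, ["banana", "kiwi"])  # 'banana' is skipped, 'kiwi' is added

stored = sorted(r[0] for r in conn.execute(
    "SELECT value FROM Choices WHERE UserID = 1"))
print(stored)  # ['apple', 'banana', 'kiwi', 'orange']
```

Notice that 'apple' and 'orange' are still there after the second call, even though the user no longer selected them.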


I don't know much about MySQL, but in MS SQL 2000+ we can execute a stored proc with XML as one of its parameters. This XML would contain a list of identity-value pairs. We would open this XML as a table using OPENXML and figure out which rows need to be deleted or inserted using a left or right outer join. As of SQL 2008 (I think) we have a new MERGE statement that lets us perform delete, update and insert row operations in one statement on ONE table. This way we can take advantage of set operations in SQL instead of looping through arrays in the application code.

You can also keep your select list retrieved from the database in session and compare the "old list" to the "newly selected list" in your application code. You would need to figure out which rows need to be deleted or added. You probably don't need to worry about updates because you are probably only keeping foreign keys in this table and the descriptions are in some kind of a reference table.

There is another way in SQL 2008 that involves passing a user-defined table type as a table-valued parameter, but I don't know much about it.

Personally, I prefer the XML route because you just send the end-state into the sp and your sp automatically figures out which rows need to be deleted or inserted.
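
The "send the end-state and let SQL work out the difference" idea can be approximated without XML or a stored proc. The sketch below is not the SQL Server approach itself: it uses Python's sqlite3, a temporary table named Desired (a made-up name) in place of the XML parameter, and a left outer join to find the rows that are missing. The sync_choices helper and the schema are assumptions.

```python
import sqlite3

def sync_choices(conn, user_id, values):
    """Make the stored choices for user_id equal to `values` (hypothetical helper)."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS Desired (value TEXT NOT NULL PRIMARY KEY)"
    )
    with conn:  # one transaction for staging, delete and insert
        conn.execute("DELETE FROM Desired")
        conn.executemany(
            "INSERT OR IGNORE INTO Desired (value) VALUES (?)",
            [(v,) for v in values],
        )
        # Remove stored rows that are not part of the desired end-state.
        conn.execute(
            "DELETE FROM Choices WHERE UserID = ? "
            "AND value NOT IN (SELECT value FROM Desired)",
            (user_id,),
        )
        # Add desired rows that have no match in Choices (outer join, no partner).
        conn.execute(
            "INSERT INTO Choices (UserID, value) "
            "SELECT ?, d.value FROM Desired d "
            "LEFT JOIN Choices c ON c.UserID = ? AND c.value = d.value "
            "WHERE c.value IS NULL",
            (user_id, user_id),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Choices (UserID INTEGER, value TEXT, PRIMARY KEY (UserID, value))"
)
sync_choices(conn, 1, ["banana", "apple", "orange", "cranberry", "lemon"])
sync_choices(conn, 1, ["banana", "kiwi"])

final = [r[0] for r in conn.execute(
    "SELECT value FROM Choices WHERE UserID = 1 ORDER BY value")]
print(final)  # ['banana', 'kiwi']
```

Rows that are in both the old and the new set ('banana' here) are never touched, which matters if other tables reference them.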

Hope this helps.
