With node-mongodb-native, when potentially inserting documents with duplicate _id's, is there any reason to check?

I'm writing code that uses the node-mongodb-native driver directly in Node.js. I've set up a collection in my database that uses my own _id space, where each unique document is guaranteed to have a unique _id. That said, given the way items get added to the database, there's a good chance that the same item will be inserted into the collection more than once, which means trying to use the same _id more than once.

What I'm doing right now to avoid any problems is calling collection.findOne({_id: ID}) before inserting, to make sure that I don't try to insert docs that are already in the collection. However, since I'm adding lots of documents at a time and it needs to be asynchronous, I'm saving a large number of variables so that when findOne()'s callback is called, I can insert the right document (if applicable).

I realized, however, that I can do away with saving variables if I don't bother to check whether a document already exists and just go ahead and insert it. If there already is a document in the collection with the same _id, I'll just get an error saying that the _id already exists, and the code will keep running. Coding it like this would decrease both the running time of my software (fewer function calls) and the RAM it takes up (far fewer variables are being saved).

However, I wanted to see if anybody thinks there is any reason not to do this. When a function like insert() returns an error, is there anything bad that's happening or could be happening that I might not be aware of?

Best, and thanks,

Sami


Your basic idea is correct, and findOne() does not solve the concurrency problem anyway: another writer can insert the same _id between your check and your insert. But there are some wrinkles.

"Is there anything bad that's happening or could be happening that I might not be aware of?"

The first problem is that the insert may not have failed because of a duplicate _id. Maybe it failed because the DB is down, or for some other reason. So make sure you check the error reason (the server reports a duplicate key with error code 11000) and handle each case appropriately.

Normally you don't want to rely on errors for routine control flow, as they tend to be expensive. So watch that you're not triggering this duplicate insert too often.
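As a rough illustration of checking the error reason, here is a minimal sketch. It assumes a recent driver version where collection.insertOne() returns a promise; the helper names isDuplicateKeyError and insertIfAbsent are made up for this example. With the older callback style the same check would go on the err argument of the callback.

```javascript
// The server reports duplicate-key violations with error code 11000.
const DUPLICATE_KEY = 11000;

// Hypothetical helper: true only for duplicate-key errors.
function isDuplicateKeyError(err) {
  return Boolean(err) && err.code === DUPLICATE_KEY;
}

// Hypothetical helper. `collection` is assumed to be a driver Collection
// (anything with an insertOne(doc) method that returns a promise will do).
// Resolves to true if the document was inserted, false if the _id already existed.
async function insertIfAbsent(collection, doc) {
  try {
    await collection.insertOne(doc);
    return true;
  } catch (err) {
    if (isDuplicateKeyError(err)) {
      return false; // expected case: this _id was inserted earlier
    }
    throw err; // anything else (DB down, validation failure, ...) is a real problem
  }
}
```

Only the duplicate-key case is swallowed here; every other failure is rethrown so that it does not get mistaken for "document already present".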

The second problem is tied to the data being inserted.

If server 1 generates an insert and server 2 generates an insert for the same document, do they generate the same insert statement?

  • If the answer is yes, then you're probably doing the right thing.
  • If the answer is no, then you may want to look at upserts (an update with the upsert option set). This does not work for all cases, but it may work for yours.
  • Additionally, there's the findAndModify command. Instead of getting an error back, you get the modified document returned. This has a steeper learning curve, but it may be the best option.
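For the upsert route, a minimal sketch might look like the following. It assumes a recent driver version where collection.updateOne() returns a promise whose result carries upsertedCount, and it assumes each document has at least one field besides _id; the helper name upsertById is made up for this example.

```javascript
// Hypothetical helper: insert the document if its _id is new, otherwise
// overwrite the listed fields on the existing document.
// `collection` is assumed to be a driver Collection (anything with a
// compatible updateOne(filter, update, options) method will do).
async function upsertById(collection, doc) {
  const { _id, ...fields } = doc; // assumes `fields` is not empty
  const result = await collection.updateOne(
    { _id },            // match on the application-defined _id
    { $set: fields },   // fields to write in either case
    { upsert: true }    // create the document when nothing matches
  );
  return result.upsertedCount === 1 ? 'inserted' : 'updated';
}
```

Whether $set is the right operator depends on what should happen when two servers disagree about the contents: $set lets the last writer win for the listed fields, while $setOnInsert would keep whatever was written first.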