How safe is MongoDB's safe mode on inserts?

Posted: 2023-03-27 03:57

I am working on a project that holds some important data, so we cannot afford to lose any of it if the power or the server goes down. We are using MongoDB as the database. I'd like to be sure that my data is in the database after the insert, and roll back the whole batch if one element was not inserted. I know it is the philosophy behind Mongo that we do not need transactions, but how can I make sure that my data is really safely stored after an insert rather than sent to some "black hole"?

Should I search for the data after inserting it?
Should I use some specific MongoDB commands?
Should I use sharding even if one server is enough for the speed we need (and, by the way, sharding doesn't guarantee anything if the power goes down)?

What is the best solution?

Your best bet is to use Write Concerns - these allow you to tell MongoDB how important a piece of data is. The quickest Write Concern is also the least safe - the data is not flushed to disk until the next scheduled flush. The safest will confirm that the data has been written to disk on a number of machines before returning.

The write concern you are looking for is FSYNC_SAFE (at least that is what it is called from the point of view of the Java driver) or REPLICAS_SAFE which confirms that your data has been replicated.
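To make the trade-off concrete, here is a minimal sketch of how such write concerns could be expressed as plain option documents. The level names ("acknowledged", "journaled" and so on) and both helper functions are made up for this example and are not the Java driver constants. The field names `w`, `j` and `wtimeout`, and the connection-string options `w`, `journal` and `wtimeoutMS`, are the ones MongoDB uses.

```python
from urllib.parse import urlencode

# Hypothetical level names for this sketch, ordered from fastest to safest.
# The keys (w, j) are the write concern field names.
LEVELS = {
    "unacknowledged": {"w": 0},                          # fire and forget
    "acknowledged": {"w": 1},                            # one server applied the write
    "journaled": {"w": 1, "j": True},                    # ... and wrote it to its journal
    "replicated": {"w": 2},                              # two nodes have the write
    "majority_journaled": {"w": "majority", "j": True},  # most nodes, journaled
}


def write_concern(level, wtimeout_ms=None):
    """Return a copy of the write concern document for a named level."""
    doc = dict(LEVELS[level])
    if wtimeout_ms is not None:
        doc["wtimeout"] = wtimeout_ms  # give up waiting after this many milliseconds
    return doc


def as_uri_options(doc):
    """Render a write concern document as connection-string options."""
    names = {"w": "w", "j": "journal", "wtimeout": "wtimeoutMS"}
    params = []
    for key, value in doc.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        params.append((names[key], value))
    return urlencode(params)


if __name__ == "__main__":
    doc = write_concern("majority_journaled", wtimeout_ms=5000)
    print(doc)
    print(as_uri_options(doc))
```

With a real driver the same fields are passed to that driver's write concern type. In PyMongo, for instance, I believe the document can be passed as `WriteConcern(**doc)`, but check your driver's documentation for the exact names.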

Bear in mind that MongoDB does not have transactions in the traditional sense - you will have to implement any rollback by hand, as you can't tell the Mongo database to do this for you.
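One way a hand-rolled rollback could look, as a rough sketch: tag every document in a batch with a shared batch id, and if any insert fails, delete everything carrying that id. `InMemoryCollection` is a hypothetical stand-in so the example runs without a server, and `insert_batch_or_undo` and the `batch_id` field are names invented here. A real collection object would need equivalent insert and delete calls.

```python
import uuid


class InMemoryCollection:
    """Hypothetical stand-in for a MongoDB collection (no server needed)."""

    def __init__(self, fail_on=None):
        self.docs = []
        self.fail_on = fail_on  # a "name" value that simulates a failed insert

    def insert_one(self, doc):
        if self.fail_on is not None and doc.get("name") == self.fail_on:
            raise RuntimeError("simulated write failure")
        self.docs.append(dict(doc))

    def delete_many(self, query):
        # Keep every document that differs from the query in at least one field.
        before = len(self.docs)
        self.docs = [
            d for d in self.docs
            if any(d.get(k) != v for k, v in query.items())
        ]
        return before - len(self.docs)


def insert_batch_or_undo(collection, docs):
    """Insert all docs, or remove the ones already inserted if any insert fails."""
    batch_id = uuid.uuid4().hex  # shared marker for this batch
    try:
        for doc in docs:
            collection.insert_one({**doc, "batch_id": batch_id})
    except Exception:
        # Compensating action: undo the part of the batch that made it in.
        collection.delete_many({"batch_id": batch_id})
        return False
    return True


if __name__ == "__main__":
    coll = InMemoryCollection(fail_on="b")
    print(insert_batch_or_undo(coll, [{"name": "a"}, {"name": "b"}]))
    print(coll.docs)
```

This is a compensating delete, not a transaction: other readers can see the partial batch until the undo runs, and the undo itself can fail, so the batch id is worth keeping for later clean-up.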

The other thing you need to do is either use the relatively new --journal option (which uses a Write Ahead Log), or use replica sets to share your data across many machines in order to maximise data integrity in the event of a crash/power loss.

Sharding is not so much a protection against hardware failure as a method for sharing the load when dealing with particularly large datasets - sharding shouldn't be confused with replica sets, which are a way of writing data to more than one disk on more than one machine.

Therefore, if your data is valuable enough, you should definitely be using replica sets, perhaps even siting slaves in other data centres/availability zones/racks/etc in order to provide the resilience you require.

There is/will be (I can't remember offhand whether this has been implemented yet) a way to specify the priority of individual nodes in a replica set, such that if the master goes down, the new master that is elected is one in the same data centre if such a machine is available (i.e. to stop a slave on the other side of the country from becoming master unless it really is the only other option).

I received a really nice answer from a person called GVP on Google Groups. I will quote it (it basically adds to Rich's answer):

I'd like to be sure that my data is in the database after the insert and rollback the whole batch if one element was not inserted.

This is a complex topic and there are several trade-offs you have to consider here.

Should I use sharding?

Sharding is for scaling writes. For data safety, you want to look at replica sets.

Should I use some specific mongoDB commands?

First thing to consider is "safe" mode or "getLastError()" as indicated by Andreas. If you issue a "safe" write, you know that the database has received the insert and applied the write. However, MongoDB only flushes to disk every 60 seconds, so the server can fail without the data on disk.

Second thing to consider is "journaling" (v1.8+). With journaling turned on, data is flushed to the journal every 100ms. So you have a smaller window of time before failure. The drivers have an "fsync" option (check that name) that goes one step further than "safe": it waits for acknowledgement that the data has been flushed to the disk (i.e. the journal file). However, this only covers one server. What happens if the hard drive on the server just dies? Well, you need a second copy.

Third thing to consider is replication. The drivers support a "W" parameter that says "replicate this data to N nodes" before returning. If the write does not reach "N" nodes before a certain timeout, then the write fails (exception is thrown). However, you have to configure "W" correctly based on the number of nodes in your replica set. Again, because a hard drive could fail, even with journaling, you'll want to look at replication. Then there's replication across data centers which is too long to get into here.

The last thing to consider is your requirement to "roll back". From my understanding, MongoDB does not have this "roll back" capacity. If you're doing a batch insert the best you'll get is an indication of which elements failed.
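On configuring "W" for the size of the replica set: a small sketch of the arithmetic, assuming every node counted holds data (arbiters do not). The helpers `majority` and `check_w` are hypothetical names for this example. A "W" larger than the number of data-bearing nodes can never be satisfied, so every write would wait until the timeout.

```python
def majority(node_count):
    """Smallest number of nodes that forms a majority of the set."""
    if node_count < 1:
        raise ValueError("a replica set needs at least one node")
    return node_count // 2 + 1


def check_w(w, node_count):
    """Reject a W that the replica set can never satisfy."""
    if not 1 <= w <= node_count:
        raise ValueError(
            f"w={w} cannot be satisfied by {node_count} data-bearing node(s)"
        )
    return w


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "nodes -> majority is", majority(n))
    print(check_w(2, 3))
```

Picking the majority as "W" means an acknowledged write is on more than half of the nodes, at the cost of waiting for the slowest node in that majority.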

Here's a link to the PHP driver on this one: http://it.php.net/manual/en/mongocollection.batchinsert.php
You'll have to check the details on replication and the W parameter. I believe the same limitations apply here.
