MongoDB sharding, how does it rebalance when adding new nodes?
I'm trying to understand MongoDB and the concept of sharding. If we start with 2 no开发者_如何转开发des and partition say, customer data, based on last name where A thru M data is stored on node 1 and N thru Z data is stored on node 2. What happens when we want to scale out and add more nodes? I just don't see how that will work.
If you have 2 nodes it doesn't mean that data is partitioned into 2 chunks. It can by partitioned to let's say 10 chunks and 6 of them are on server 1 ane rest is on server 2.
When you add another server MongoDB is able to redistribute those chunks between nodes of new configuration
You can read more in official docs:
http://www.mongodb.org/display/DOCS/Sharding+Introduction
http://www.mongodb.org/display/DOCS/Choosing+a+Shard+Key
If there are multiple shards available, MongoDB will start migrating data to other shards once you have a sufficient number of chunks. This migration is called balancing and is performed by a process called the balancer.The balancer moves chunks from one shard to another.
For a balancing round to occur, a shard must have at least nine more chunks than the least-populous shard. At that point, chunks will be migrated off of the crowded shard until it is even with the rest of the shards.
When you add new node to cluster, MongoDB redistribute those chunks among nodes of new configuration. It's a little extract ,to get complete understanding of how does it rebalance when adding new node read chapter 2 "Understanding Sharding" of Kristina Chodrow's book "Scaling MongoDB"
精彩评论