How to load balance Cassandra cluster nodes?
I am using Cassandra 0.7.8 on a cluster of 4 machines. I uploaded some files using Map/Reduce, and it looks like the data got distributed across only 2 of the nodes. When I used RF=3, the data was distributed equally across all 4 nodes with the configuration below.
Here is some configuration info:
- ByteOrderedPartitioner
- Replication Factor = 1 (since I have a storage problem; it will be increased later)
- initial token: value has not been set
- create keyspace ipinfo with replication_factor = 1 and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
[cassandra@cassandra01 apache-cassandra-0.7.8]$ bin/nodetool -h 172.27.10.131 ring
Address         Status  State   Load      Owns    Token
                                                  Token(bytes[fddfd9bae90f0836cd9bff20b27e3c04])
172.27.10.132   Up      Normal  11.92 GB  25.00%  Token(bytes[3ddfd9bae90f0836cd9bff20b27e3c04])
172.27.15.80    Up      Normal  10.21 GB  25.00%  Token(bytes[7ddfd9bae90f0836cd9bff20b27e3c04])
172.27.10.131   Up      Normal  54.34 KB  25.00%  Token(bytes[bddfd9bae90f0836cd9bff20b27e3c04])
172.27.15.78    Up      Normal  58.79 KB  25.00%  Token(bytes[fddfd9bae90f0836cd9bff20b27e3c04])
Can you suggest how I can balance the load on my cluster?
Regards, Thamizhannal
The keys in the data you loaded did not sort high enough to reach the upper 2 nodes in the ring. With ByteOrderedPartitioner, each node stores only the keys that fall in its own token range, so a key set clustered at the low end of the byte range lands on the nodes with the lowest tokens. You could change to the RandomPartitioner, as suggested by frail. Another option is to rebalance your ring as described in the Cassandra wiki; this is the route to take if you want to keep your keys ordered. Of course, as more data is loaded, you will need to rebalance again to keep the distribution of data relatively even. If you plan on doing just random reads and no range slices, then switch to the RandomPartitioner and be done with it.
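To make the rebalancing idea concrete, here is a minimal sketch of how tokens could be chosen for an order-preserving partitioner from a sample of row keys. It is not taken from the Cassandra wiki: the function names and the sample keys are hypothetical, and it assumes a representative sample of your real keys is available as Python bytes objects.

```python
import bisect


def balanced_tokens(sample_keys, node_count):
    """Pick one token per node so each node owns an equal share of the sample.

    Assumption: a node with token T owns the keys in (previous token, T],
    so each token is the last key of its equal-sized slice.
    """
    keys = sorted(set(sample_keys))  # bytes compare byte by byte, unsigned
    if node_count < 1 or len(keys) < node_count:
        raise ValueError("need at least one distinct sample key per node")
    return [keys[(i + 1) * len(keys) // node_count - 1]
            for i in range(node_count)]


def owner_index(tokens, key):
    """Index of the owning node: first token >= key, wrapping to node 0."""
    return bisect.bisect_left(tokens, key) % len(tokens)


# Hypothetical sample: 64 one-byte keys, all in the low end of the key space.
sample = [bytes([b]) for b in range(64)]
tokens = balanced_tokens(sample, 4)

for node, token in enumerate(tokens):
    owned = sum(1 for k in sample if owner_index(tokens, k) == node)
    print(node, token.hex(), owned)
```

Each printed hex string is a candidate token for one node. Applying the values (for example with nodetool move, one node at a time) and re-running the calculation as the key distribution drifts is left to the procedure in the wiki.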
If you want better load balancing, you need to change your partitioner to RandomPartitioner. But that would cause problems if you are using range queries in your application. You had better check this article:
Cassandra: RandomPartitioner vs OrderPreservingPartitioner
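If you do switch to RandomPartitioner, it is still worth setting tokens explicitly rather than leaving the initial token unset. A minimal sketch, assuming the RandomPartitioner token space runs from 0 to 2**127 and that evenly spaced tokens are what you want; the function name is hypothetical:

```python
def random_partitioner_tokens(node_count):
    """Evenly spaced tokens over the assumed 0 .. 2**127 token space."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


# One token per node for the 4-machine cluster in the question.
for node, token in enumerate(random_partitioner_tokens(4)):
    print(node, token)
```

Because RandomPartitioner hashes each key before placing it, evenly spaced tokens give each node roughly the same share of data regardless of how the raw keys are distributed.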