YCSB - why can I never find a shard configuration anywhere on the internet?
I see all kinds of references to MongoDB as a client for the YCSB benchmark, which tests NoSQL database server scalability and elasticity.
https://github.com/brianfrankcooper/YCSB
However, it is clear that the benchmark would require some kind of sharding setup, because the tests are designed to run on 6 to 10 server machines to show the scaling and elasticity.
I cannot find any reference on the internet for what that configuration looks like with MongoDB. I cannot find anyone who published results who also published what their configuration looks like.
Was this really done successfully? How do the results compare to the original YCSB clients such as Cassandra, HBase, etc.?
I am especially confused because the MongoDB client code says "there is one DB instance per client thread" (see snippet):
public class MongoDbClient extends DB {
private static final Logger logger = LoggerFactory.getLogger(MongoDbClient.class);
private Mongo mongo;
private WriteConcern writeConcern;
private String database;
/**
* Initialize any state for this DB. Called once per DB instance; there is
* one DB instance per client thread.
*/
public void init() throws DBException {
// initialize MongoDb driver
Properties props = getProperties();
......
However, the Brian Cooper YCSB results paper states that they ran their workloads with up to 500 threads.
6.1 Experimental Setup
For most experiments, we used six server-class machines (dual 64-bit quad core 2.5 GHz Intel Xeon CPUs, 8 GB of RAM, 6 disk RAID-10 array and gigabit ethernet) to run each system. We also ran PNUTS on a 47 server cluster to successfully demonstrate that YCSB can be used to benchmark larger systems. PNUTS required two additional machines to serve as a configuration server and router, and HBase required an additional machine called the “master server.” These servers were lightly loaded, and the results we report here depend primarily on the capacity of the six storage servers. The YCSB Client ran on a separate 8 core machine. The Client was run with up to 500 threads, depending on the desired offered throughput. We observed in our tests that the client machine was not a bottleneck; in particular, the CPU was almost idle as most time was spent waiting for the database system to respond.
Does anyone know where there is a sharding configuration for this benchmark? Are there any real results against the competition that are backed up by a shard configuration, or by a detailed explanation of why sharding would not be necessary?
Thanks, -Robert
We did not include MongoDB as part of our initial YCSB study. The Mongo client was contributed later by another developer, but I haven't run the full benchmark against Mongo so I don't know whether the client really does everything it needs to. If it doesn't, go ahead and submit a patch and I'll try to include it!
Also, the "one DB instance per client thread" comment means one instance of the DB client class in the JVM, not necessarily one MongoDB server.
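To make that distinction concrete, here is a minimal sketch in plain Java. It does not use the real YCSB or MongoDB driver classes: `SketchClient` is a hypothetical stand-in for a binding like `MongoDbClient`, and the URL `mongodb://localhost:27017` is only an assumed example endpoint. Each thread builds its own client object, but every object points at the same server address, so the thread count says nothing about how many database servers are involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadClientSketch {

    /** Hypothetical stand-in for a YCSB DB binding; not the real MongoDbClient. */
    static class SketchClient {
        private final String serverUrl;

        SketchClient(String serverUrl) {
            this.serverUrl = serverUrl;
        }

        String serverUrl() {
            return serverUrl;
        }
    }

    /** Returns {number of client objects created, number of distinct server URLs}. */
    static int[] run(int threadCount, String serverUrl) throws InterruptedException {
        Set<SketchClient> clients = ConcurrentHashMap.newKeySet();
        Set<String> servers = ConcurrentHashMap.newKeySet();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                // One client object per thread, all configured with the same URL.
                SketchClient client = new SketchClient(serverUrl);
                clients.add(client);
                servers.add(client.serverUrl());
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return new int[] { clients.size(), servers.size() };
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed example endpoint; any single URL illustrates the same point.
        int[] counts = run(500, "mongodb://localhost:27017");
        System.out.println(counts[0] + " client objects, " + counts[1] + " server endpoint(s)");
    }
}
```

Whether the servers behind that one address are sharded is a separate deployment question. In a sharded MongoDB cluster the clients would typically all point at a mongos router rather than at individual shards.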