开发者

MongoDB sharding scalability - performance of queries hitting a single chunk?

In doing some preliminary tests of MongoDB sharding, I hoped and expected that the time to execute queries that hit only a single chunk of data on one shard/machine would remain relatively constant as more data was loaded. But I found a significant slowdown.

Some details:

For my simple test, I used two machines to shard and tried queries on similar collections with 2 million rows and 7 million rows. These are obviously very small collections that don’t even require sharding, yet I was surprised to already see a significant consistent slowdown for queries hitting only a single chunk. Queries included the sharding key, were for result sets ranging from 10s to 100000s of rows开发者_如何转开发, and I measured the total time required to scroll through the entire result sets. One other thing: since my application will actually require much more data than can fit into RAM, all queries were timed based on a cold cache.

Any idea why this would be? Has anyone else observed the same or contradictory results?


Further details (prompted by Theo):

For this test, the rows were small (5 columns including _id), and the key was not based on _id, but rather on a many-valued text column that almost always appears in queries.

The command db.printShardingStatus() shows how many chunks there are as well as the exact key values used to split ranges for chunks. The average chunk contains well over 100,000 rows for this dataset and inspection of key value splits verifies that the test queries are hitting a single chunk.

For the purpose of this test, I was measuring only reads. There were no inserts or updates.


Update:

Upon some additional research, I believe I determined the reason for the slowdown: MongoDB chunks are purely logical, and the data within them is NOT physically located together (source: "Scaling MongoDB" by Kristina Chodorow). This is in contrast to partitioning in traditional databases like Oracle and MySQL. This seems like a significant limitation, as sharding will scale horizontally with the addition of shards/machines, but less well in the vertical dimension as data is added to a collection with a fixed number of shards.

If I understand this correctly, if I have 1 collection with a billion rows sharded across 10 shards/machines, even a query that hits only one shard/machine is still querying from a large collection of 100 million rows. If values for the sharding key happen to be located contiguously on disk, then that might be OK. But if not and I'm fetching more than a few rows (e.g. 1000s), then this seems likely to lead to lots of I/O problems.

So my new question is: why not organize chunks in MongoDB physically to enable vertical as well as horizontal scalability?


What makes you say the queries only touched a single chunk? If the result ranged up to 100 000 rows it sounds unlikely. A chunk is max 64 Mb, and unless your objects are tiny that many won't fit. Mongo has most likely split your chunks and distributed them.

I think you need to tell us more about what you're doing and the shape of your data. Were you querying and loading at the same time? Do you mean shard when you say chunk? Is your shard key something else than _id? Do you do any updates while you query your data?

There are two major factors when it comes to performance in Mongo: the global write lock and it's use of memory mapped files. Memory mapped files mean you really have to think about your usage patterns, and the global write lock makes page faults hurt really badly.

If you query for things that are all over the place the OS will struggle to page things in and out, this can be especially hurting if your objects are tiny because whole pages have to be loaded just to access a small pieces, lots of RAM will be wasted. If you're doing lots of writes that will lock reads (but usually not that badly since writes happen fairly sequentially) -- but if you're doing updates you can forget about any kind of performance, the updates block the whole database server for significant amounts of time.

Run mongostat while you're running your tests, it can tell you a lot (run mongostat --discover | grep -v SEC to see the metrics for all you shard masters, don't forget to include --port if your mongos is not running on 27017).


Addressing the questions in your update: it would be really nice if Mongo did keep chunks physically together, but it is not the case. One of the reasons is that sharding is a layer on top of mongod, and mongod is not fully aware of it being a shard. It's the config servers and mongos processes that know of shard keys and which chunks that exist. Therefore, in the current architecture, mongod doesn't even have the information that would be required to keep chunks together on disk. The problem is even deeper: Mongo's disk format isn't very advanced. It still (as of v2.0) does not have online compaction (although compaction got better in v2.0), it can't compact a fragmented database and still serve queries. Mongo has a long way to go before it's capable of what you're suggesting, sadly.

The best you can do at this point is to make sure you write the data in order so that chunks will be written sequentially. It probably helps if you create all chunks beforehand too, so that data will not be moved around by the balancer. Of course this is only possible if you have all your data in advance, and that seems unlikely.


Disclaimer: I work at Tokutek

So my new question is: why not organize chunks in MongoDB physically to enable vertical as well as horizontal scalability?

This is exactly what is done in TokuMX, a replacement server for MongoDB. TokuMX uses Fractal Tree indexes which have high write throughput and compression, so instead of storing data in a heap, data is clustered with the index. By default, the shard key is clustered, so it does exactly what you suggest, it organizes the chunks physically, by ensuring all documents are ordered by the shard key on disk. This makes range queries on the shard key fast, just like on any clustered index.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜