SimpleDB parallelism

There is a comment in the SimpleDB documentation that states basically that if you need more parallelism then you should use multiple domains.

This leads me to this question: does SimpleDB serialize all of its requests, even if they come from multiple client applications?

Does anyone have a definitive answer to that?


SimpleDB (according to a lead engineer at Netflix who blogs about their transition onto it) seems to rate-limit accounts per domain. So you might have one account issuing queries or inserts from 10 threads against a single domain, and those (from what I gathered) will be rate-limited to approximately 40-70 requests per second (I have seen varying reports).

The other thing to consider is that as your domain grows in size, query performance degrades.

Because of these 2 behaviors, it is recommended that for large data, you "shard" your data across multiple domains.

Consider a social app that tracks tweets. You might create the following five domains: TWEETS_0, TWEETS_1, TWEETS_2, TWEETS_3, TWEETS_4

Then shard your inserts across them: int domainIndex = tweet.getId() % 5; simpleDB.doInsert(domainIndex, arguments...)

or some such pseudo-code. AWS recently raised the domain limit to 250 per customer, so it seems this sharding design is expected to be used.
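
To make the modulo routing above a bit more concrete, here is a minimal Java sketch of just the domain-selection step. The class name TweetSharder and the method domainFor are hypothetical, invented for illustration; the actual write to SimpleDB (the doInsert call in the pseudo-code) is deliberately left out, since it depends on whichever client library you use.

```java
// Hypothetical helper (not part of any AWS SDK): maps a tweet ID to one of
// N SimpleDB domain names of the form TWEETS_<index>.
final class TweetSharder {
    private static final String DOMAIN_PREFIX = "TWEETS_";

    private TweetSharder() {}

    // Returns the domain name that should hold the given tweet.
    static String domainFor(long tweetId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // Math.floorMod keeps the index in [0, shardCount) even for negative
        // IDs, where the plain % operator would yield a negative remainder.
        int index = (int) Math.floorMod(tweetId, (long) shardCount);
        return DOMAIN_PREFIX + index;
    }

    public static void main(String[] args) {
        long[] sampleIds = {10L, 11L, 12L, 13L, 14L};
        for (long id : sampleIds) {
            // The real insert into the chosen domain would happen here.
            System.out.println(id + " -> " + domainFor(id, 5));
        }
    }
}
```

One thing to keep in mind with this kind of routing: if you later change the shard count, most IDs will map to a different domain, so the count is best chosen up front or changed only together with a data migration.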

The pipe-dream promise of SimpleDB is "we scale, you worry about code", but the reality is that we just aren't there yet.

You still have to worry about a handful of details.


No, of course SimpleDB doesn't "serialize all of its requests"; however, it must do some amount of locking to ensure transactional consistency. Sharding across domains is an easy way to minimize the impact of this.
