Cassandra multiget performance
I've got a cassandra cluster with a fairly small number of rows (2 million or so, which I would hope is "small" for cassandra). Each row is keyed on a unique UUID, and each row has about 200 columns (give or take a few). All in all, these are pretty small rows: no binary data or large amounts of text, just short strings.
I've just finished the initial import into the cassandra cluster from our old database. I've tuned the hell out of cassandra on each machine. There were hundreds of millions of writes, but no reads. Now that it's time to USE this thing, I'm finding that read speeds are absolutely dismal. I'm doing a multiget using pycassa on anywhere from 500 to 10,000 rows at a time. Even at 500 rows, the performance is awful, sometimes taking 30+ seconds.
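Roughly, the read path looks like this (a minimal sketch; the keyspace, column family, and host names are placeholders, not my real schema):

```python
import uuid
import pycassa

# Placeholder keyspace, column family, and host names for illustration.
pool = pycassa.ConnectionPool('MyKeyspace', server_list=['node1:9160'])
cf = pycassa.ColumnFamily(pool, 'Rows')

# A batch of 500 row keys (UUID strings, matching the key scheme above).
keys = [str(uuid.uuid4()) for _ in range(500)]

# One multiget over the whole batch; this is the call that takes 30+ seconds.
rows = cf.multiget(keys)
```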
What would cause this type of behavior? What sort of things would you recommend after a large import like this? Thanks.
Sounds like you are I/O-bottlenecked. Cassandra does about 4,000 reads/s per core IF your data fits in RAM; otherwise you will be seek-bound just like anything else.
I note that normally "tuning the hell" out of a system is reserved for AFTER you start putting load on it. :)
See:
- http://spyced.blogspot.com/2010/01/linux-performance-basics.html
- http://www.datastax.com/docs/0.7/operations/cache_tuning
Is it an option to split up the multiget into smaller chunks? That would spread the load across multiple nodes and give you smaller packets to deserialize, both of which could improve performance.
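Something like this rough sketch (the chunk size is a starting guess to tune, not a recommendation):

```python
def chunked_multiget(cf, keys, chunk_size=100):
    """Issue several small multigets instead of one huge one.

    cf is a pycassa.ColumnFamily, keys is a list of row keys.
    """
    rows = {}
    for i in range(0, len(keys), chunk_size):
        rows.update(cf.multiget(keys[i:i + chunk_size]))
    return rows
```

Depending on your pycassa version, multiget may also accept a buffer_size argument that does this batching for you under the hood.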
That brings me to the next question: what is your read consistency set to? In addition to the I/O bottleneck @jbellis mentioned, you could also have a network traffic issue if you are requiring a particularly high level of consistency.
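If you are reading at QUORUM or higher, dropping to ONE may help. A sketch (whether ONE is acceptable depends on your application; the names are placeholders again):

```python
import pycassa

pool = pycassa.ConnectionPool('MyKeyspace', server_list=['node1:9160'])

# Set the default read consistency when constructing the column family...
cf = pycassa.ColumnFamily(pool, 'Rows',
                          read_consistency_level=pycassa.ConsistencyLevel.ONE)

# ...or override it per call. ONE returns after a single replica answers;
# QUORUM waits on a majority of replicas and costs extra round trips.
rows = cf.multiget(['some-row-key'],
                   read_consistency_level=pycassa.ConsistencyLevel.ONE)
```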