
How scalable are automatic secondary indexes in Cassandra 0.7?

As far as I understand, automatic secondary indexes are generated for node-local data.

In this case, a query by secondary index involves all nodes storing part of the column family to get results (?). So, if I am right, when data is spread across 50 nodes, are all 50 nodes involved in a single query?

How far can this scale? Is it more scalable than manual secondary indexes (an inverted-index column family)? Does it work for only a few nodes, or for hundreds of nodes?


See Stu's answer from the mailing list: http://www.mail-archive.com/user@cassandra.apache.org/msg10506.html


Yes, if you need to fetch all indexed rows, then the index query involves all nodes. But this is actually more efficient than building your own index! Details here.

However, if you look up only a few rows, and each index entry maps to very many rows, then it's likely that the very first node can answer your query, which will then involve only one node. From the Apache mailing list:

The first node can answer the question as long as you've requested less rows than the first node has on it. Hence the "low cardinality" point in what you quoted.

(by Jonathan Ellis, here.)
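To make the quoted point concrete, here is a toy simulation in Python. It is only a sketch of the behaviour described in the quote, not Cassandra's actual code: the `Node` class, `index_query`, and `build_cluster` are hypothetical names invented for illustration, and the model ignores replication, token ranges, and consistency levels. It assumes the query visits nodes one after another and stops once the requested row count is satisfied.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for a node that indexes only its own rows."""

    def __init__(self, name):
        self.name = name
        self.rows = {}                   # row key -> indexed column value
        self.index = defaultdict(list)   # indexed value -> local row keys

    def insert(self, key, value):
        self.rows[key] = value
        self.index[value].append(key)

    def lookup(self, value, limit):
        # Return at most `limit` local row keys matching the indexed value.
        return self.index.get(value, [])[:limit]


def index_query(nodes, value, limit):
    """Visit nodes in order and stop as soon as `limit` rows are collected.

    Returns the matching row keys and the number of nodes contacted.
    """
    results = []
    contacted = 0
    for node in nodes:
        if len(results) >= limit:
            break
        contacted += 1
        results.extend(node.lookup(value, limit - len(results)))
    return results, contacted


def build_cluster(node_count, rows_per_node, values):
    """Spread rows evenly over the nodes, cycling through the given values."""
    nodes = [Node(f"node{i}") for i in range(node_count)]
    for i, node in enumerate(nodes):
        for j in range(rows_per_node):
            node.insert(f"row-{i}-{j}", values[j % len(values)])
    return nodes


# Low-cardinality index: only two distinct values, so each maps to many rows.
cluster = build_cluster(node_count=50, rows_per_node=100, values=["US", "DE"])

few, nodes_for_few = index_query(cluster, "US", limit=10)
all_rows, nodes_for_all = index_query(cluster, "US", limit=10**9)

print(len(few), nodes_for_few)        # small limit: answered by the first node
print(len(all_rows), nodes_for_all)   # fetching everything: every node is hit
```

In this model, a request for 10 rows is satisfied by the first node alone, while fetching every matching row touches all 50 nodes. With a high-cardinality value (few matching rows per node), even a small limit would force the loop to visit many nodes, which is the reason for the "low cardinality" caveat.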

(I also posted a follow-up to your question on the mailing list, inquisitor, because I didn't really understand the answer you got there (the one linked in Schildmeijer's answer).)
