Cassandra or MySQL 5: which is the better choice for the future?

Should I use Cassandra for a project with 100,000 users? MySQL 5 gives me full-text search and table partitioning. I'm starting a Q&A system like SO, built with CodeIgniter, to replace a vBulletin forum. The old vBulletin system had 100,000 users and around 80,000 posts in total. Over the next 3 or 4 years I expect both users and posts to keep growing. So, should I use Cassandra instead of MySQL 5?

If I use Cassandra, I need to move from Grid-Service to Dedicated-Virtual hosting at Media Temple: Cassandra is not provided as part of a hosting system, so I would need a VPS or DV server. If I use MySQL, hosting is not a problem, but what about performance and search speed?

By the way, what database is Stack Overflow using?


From the information you provided, I would suggest sticking with MySQL.

As a side note, Facebook used MySQL at first and moved to Cassandra only once it was storing over 7 terabytes of inbox data for over 100 million users.

  • Source: Lakshman, Malik: Cassandra - A Decentralized Structured Storage System.

Wikipedia also handles hundreds of Gigabytes of text data in MySQL.


You say 100,000 users - but how many concurrent users?

"Cassandra is not built into the hosting system"

Using a hosted service on a single server suggests a very small-scale operation, and you're obviously limited by your budget. There's certainly no advantage to running Cassandra on a single server node.

"MySQL 5 has full-text search"

That is not a very scalable solution. You should definitely think about using a normalized search (which I believe you'd have to do anyway if you were migrating to Cassandra).
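One possible reading of "normalized search" is a hand-maintained word index: each distinct word gets a row, and a link table records which posts contain it. The sketch below assumes that reading. The table names, column names, and helper functions are all hypothetical, and it uses Python's built-in sqlite3 module as a stand-in so it runs without a server (`INSERT OR IGNORE` is SQLite syntax; MySQL spells it `INSERT IGNORE`).

```python
import re
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE);
CREATE TABLE post_words (
    word_id INTEGER NOT NULL REFERENCES words(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    PRIMARY KEY (word_id, post_id)
);
"""


def tokenize(text):
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def add_post(conn, body):
    """Store a post and index each distinct word it contains."""
    post_id = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,)).lastrowid
    for word in tokenize(body):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT id FROM words WHERE word = ?", (word,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO post_words (word_id, post_id) VALUES (?, ?)",
            (word_id, post_id),
        )
    return post_id


def search(conn, query):
    """Return ids of posts containing every word in the query, ascending."""
    terms = sorted(tokenize(query))
    if not terms:
        return []
    placeholders = ", ".join("?" for _ in terms)
    sql = f"""
        SELECT pw.post_id
        FROM post_words AS pw
        JOIN words AS w ON w.id = pw.word_id
        WHERE w.word IN ({placeholders})
        GROUP BY pw.post_id
        HAVING COUNT(DISTINCT w.word) = ?
        ORDER BY pw.post_id
    """
    return [row[0] for row in conn.execute(sql, (*terms, len(terms)))]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = add_post(conn, "How do I partition a MySQL table?")
second = add_post(conn, "Cassandra versus MySQL for a forum")
print(search(conn, "mysql"))        # both posts mention MySQL
print(search(conn, "mysql forum"))  # only the second post has both words
```

Because lookups go through ordinary indexed equality joins, the same tables can be replicated or split across servers like any other relational data, which is presumably the scalability point being made.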

Given that you can comfortably scale the MySQL solution to multiple databases using replication before you even think about a fully clustered solution, and that you obviously don't have the budget to do your own hosting, migrating to Cassandra seems like massive overkill.
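To make the replication option concrete, here is a minimal sketch of how an application might split traffic once replicas exist: writes go to the primary, reads rotate across the replicas. The `ReplicatedRouter` class and the host names are hypothetical, and the prefix check is deliberately naive; a real application would more likely rely on its framework's or driver's connection configuration.

```python
import itertools


class ReplicatedRouter:
    """Pick a server for each statement: writes to the primary, reads to replicas."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        if sql.lstrip().lower().startswith(self.READ_PREFIXES):
            return next(self._replicas)
        return self.primary


# Hypothetical host names, for illustration only.
router = ReplicatedRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM posts WHERE id = 1"))
print(router.route("UPDATE posts SET title = 'x' WHERE id = 1"))
```

MySQL replication is asynchronous by default, so a read sent to a replica can briefly lag behind a write that was just made; reads that must see the latest write are usually sent to the primary.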


I would NOT recommend using Cassandra in your case, for the following reasons:

  1. Cassandra requires a good understanding of the application you're building. It is much harder to make changes and to run complex queries against data stored in Cassandra. SQL is more flexible and easier to maintain. Cassandra is a good fit when you need to store huge amounts of data and you know exactly how that data will be accessed and sorted.

  2. MySQL works fine for millions of rows if proper indexes are built.

  3. If you hit bottlenecks with MySQL in the future, you can look at exactly what the problems are and move just those parts to Cassandra. In other words, you should be able to combine both approaches, SQL and NoSQL, in the same project.
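The point about indexes in the list above can be illustrated with a small experiment. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `posts` table, its columns, and the `query_plan` helper are hypothetical. MySQL exposes similar information through its own `EXPLAIN` statement, with different output.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan detail strings SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO posts (user_id, title) VALUES (?, ?)",
    [(n % 1000, f"post {n}") for n in range(10_000)],
)

lookup = "SELECT title FROM posts WHERE user_id = ?"

# Without an index on user_id, the engine has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# With the index, the same query becomes an index search.
conn.execute("CREATE INDEX idx_posts_user_id ON posts (user_id)")
plan_after = query_plan(conn, lookup, (42,))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should report a table scan and the second should name `idx_posts_user_id`.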

As for MySQL's full-text index, I'd call it useless: it performs too poorly for high-load projects. Look at sphinxsearch.com, which is a great full-text search implementation made for SQL databases.

But if you expect your system to grow fast and to serve millions of users, you should consider Cassandra from the beginning.
