Gem that allows data access across sharded MySQL databases while still using ActiveRecord

2023-01-14 01:58 Q&A author:

This is a relatively complex problem that I am thinking about, so please suggest edits or comment on any parts that are unclear. I will update and iterate based on your comments.

I am thinking of developing a Rails gem that simplifies the use of sharded tables, even when most of your data is stored in relational databases. I believe this is similar to the approach used at Quora or FriendFeed when they hit a wall scaling with traditional MySQL, where most of the potential solutions either required a massive migration (NoSQL) or were just really painful (sticking with relational completely).

http://bret.appspot.com/entry/how-friendfeed-uses-mysql
http://www.quora.com/When-Adam-DAngelo-says-partition-your-data-at-the-application-level-what-exactly-does-he-mean?q=application+layer+quora+adam+

Essentially, how can we continue using MySQL for the many things it is really good at, while still allowing parts of the system to scale? This would allow someone who got started with MySQL/ActiveRecord, but hit a scaling roadblock, to easily scale the parts of the database where it makes sense.

In our case, we are using Ruby on Rails on a sharded database and storing JSON blobs in it. Since we cannot do joins, we are creating tables for the relationships between entities.

For example, we have 10 different types of entities. Entities can be linked to each other through big (sharded) relationship tables.

The tables are extremely simple. The index is (Id1, Id2, ..., type), and the data is stored in the JSON blob.

Id, type, {json data}
Id1, Id2, type, {json data}
Id1, Id2, Id3, type, {json data}
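
To make the layout above concrete, here is a minimal in-memory sketch of how such rows could be routed and stored. It is not the gem itself: the class name `ShardedRelationStore`, the CRC32-of-first-id routing rule, and the use of plain hashes in place of MySQL shards are all assumptions made for illustration.

```ruby
require "json"
require "zlib"

# Hypothetical in-memory stand-in for a set of MySQL shards.
# Each "shard" is a Hash keyed by [ids, type]; a real version would
# run SQL against that shard's connection instead.
class ShardedRelationStore
  def initialize(shard_count)
    @shards = Array.new(shard_count) { {} }
  end

  # Assumption: rows are routed by a stable hash of the first id,
  # so every relationship of one entity lands on the same shard.
  def shard_index(id1)
    Zlib.crc32(id1.to_s) % @shards.size
  end

  # ids is [id1], [id1, id2] or [id1, id2, id3];
  # data is any JSON-serializable Hash.
  def put(ids, type, data)
    @shards[shard_index(ids.first)][[ids, type]] = JSON.generate(data)
  end

  # Returns the parsed JSON blob, or nil when no such row exists.
  def get(ids, type)
    blob = @shards[shard_index(ids.first)][[ids, type]]
    blob && JSON.parse(blob)
  end
end

store = ShardedRelationStore.new(4)
store.put([42], "user", { "name" => "alice" })
store.put([42, 7], "follows", { "since" => "2011-01-01" })
```

Routing on the first id alone keeps all of an entity's relationships on one shard, at the cost of uneven shards when a few entities have very many relationships.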

We have put a lot of work into creating higher-level interfaces for storing a range of relational data sets.

For any given type, you can define a storage type: value, unweighted list, weighted list, or weighted list with GUIDs.

We have higher-level interfaces for each of them: querying, sorting, timestamp comparison, intersections, etc.
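
As a rough illustration of one of these storage types, the sketch below models a weighted list with sorting, a timestamp-style comparison, and intersection. The class `WeightedList` and its method names are hypothetical and kept in memory; they are not the actual interfaces described above.

```ruby
# Hypothetical "weighted list" storage type: members with a numeric weight
# (for example a timestamp or a score), kept in memory for illustration.
class WeightedList
  def initialize
    @weights = {}
  end

  # Adds or updates a member; returns self so calls can be chained.
  def add(member, weight)
    @weights[member] = weight
    self
  end

  def members
    @weights.keys
  end

  # Members ordered by weight, highest first, optionally truncated.
  def sorted(limit = nil)
    ordered = @weights.sort_by { |_member, weight| -weight }.map(&:first)
    limit ? ordered.first(limit) : ordered
  end

  # Members whose weight (e.g. a timestamp) is strictly greater than cutoff.
  def newer_than(cutoff)
    @weights.select { |_member, weight| weight > cutoff }.keys
  end

  # Members present in both lists.
  def intersection(other)
    members & other.members
  end
end

alice_follows = WeightedList.new.add("bob", 100).add("carol", 300).add("dave", 200)
erin_follows  = WeightedList.new.add("carol", 50).add("dave", 75).add("frank", 10)

common = alice_follows.intersection(erin_follows)
```

An unweighted list would be the same structure without the weight, and a plain value would be a single row.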

That way, if someone realizes that they need to scale a specific part of the database, they can keep most of their infrastructure and move only the tables they need into this sharded database.

What are your thoughts? As mentioned above, I would love to know what you folks think.

Scalability is a tough nut to crack. My background includes two years as a sales engineer for BEA Systems, back when all they sold was the TUXEDO middleware (TUXEDO == Transactions for UNix Extended for Distributed Operations). TUXEDO is still the king of the TPC-C benchmark on Unix platforms.

Scaling WRT a database is not so much about the database itself; it's about how you access that database. For example, if you establish a connection to a database, and you want that single connection to scale, make that connection always access the same table in the database. The problem with today's infrastructures (RoR included) is that when they open generic connections, those connections access many tables in the database.

So if you want to make a database CONNECTION scale, make that connection focus the database engine on as few database resources as possible. If you can manage to create a 'focused' connection, that ONLY accesses one table, and one table index, for example, it will scale much better than a connection that accesses EVERY table in the database and every index defined for all those tables.
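
One way to picture a 'focused' connection is a registry that hands out one dedicated connection per table. The sketch below only illustrates the idea: `FakeConnection` and `FocusedConnections` are hypothetical names, and the fake connection records queries instead of talking to MySQL.

```ruby
# Hypothetical stand-in for a database connection; a real one would wrap
# a MySQL client. It only records the SQL it is asked to run.
class FakeConnection
  attr_reader :table, :queries

  def initialize(table)
    @table = table
    @queries = []
  end

  def query(sql)
    @queries << sql
    sql
  end
end

# Hands out one dedicated connection per table, so that each connection
# only ever touches a single table.
class FocusedConnections
  def initialize
    @by_table = {}
  end

  # Returns the connection reserved for this table, creating it on first use.
  def for_table(table)
    @by_table[table] ||= FakeConnection.new(table)
  end

  def select_all(table)
    for_table(table).query("SELECT * FROM #{table}")
  end
end

pool = FocusedConnections.new
pool.select_all("relationships")
pool.select_all("entities")
pool.select_all("relationships")
```

Here both `relationships` queries go through the same connection, and the `entities` query never shares it.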
