Sharding at application level

2023-02-20 22:19 问答作者：

I am designing a multi-tenant system and am considering sharding by tenant at the application layer level instead of database.

Hypothetically, the way this should work is that for incoming request a router process has a global collection of tenants containing primary attributes to determine the tenant for this request as well as the virtual shard id. This virtual shard id is further mapped to an actual shard.

The actual shard contains both the code for application as well as whole data for this tenant. These shards would be LNMP (Linux, Nginx, MySQL/MongoDB, PHP) servers.

The router process should act as proxy. It should be able to run some code to determine the target shard for incoming request based on the collection stored in some local db or files. To be able to scale this better, i am considering making the shards themselves act as routers also so that they can run a revers开发者_如何学Goe proxy that will forward the request to appropriate shard. Maybe the nginx instance running on shard can also act as that reverse proxy. But how will it execute the application logic needed to match up the request with the appropriate shard.

I will appreciate any ideas and suggestions for this router implementation.

Thanks

Another option would be to use a product such as dbShards. dbShards is the only sharding product that shards at the application level. This way you can use any RDMS (Postgres, MySQL, etc.) and still be able to shard your database without having to put some kind of proxy in-between. A lot of the other sharding products rely on a proxy to point the transactions to the correct shard, but dbShards knows where to go without having to "ask" anyone else.

Great product. dbshards

Unless you expect your tenants to generate approximately equal data volume, sharding by tenant will not be very efficient.

As to application level sharding in general, let me share my own experience:

Version 1 of our high-volume SaaS product sharded at the application level. You will find that resharding as you grow will be a major headache if you shard against a SQL type solution at the application level, or you will have to write significant tooling to automate the process.

We switched to MongoDB (after considering multiple alternatives including Cassandra) in no small part because of all of the built-in support for resharding / rebalancing as data grows.

If your application does not need the relational capabilities of MySQL, I would suggest concentrating your efforts on MongoDB (since you have already identified that as a possible data platform) if you expect more than modest data growth. Allow MongoDB to handle the data sharding.

继续阅读：nginx sharding

Sharding at application level

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？