开发者

drawbacks of storing all ''things' in a central table

I am not sure if there is a term to describe this, but I have observed that content management systems store all kinds of data in a single table with their bare minimum properties while the meta data is stored in another table in form of key value pairs.

for eg. everything (blog posts, pages, images, events etc) is stored in one table and considered as a post.

I understand that this allows for abstraction and easy extensibility

we are considering designing our new project this way. It is not exactly a CMS but we plan to keep adding modules to it in stages. Lets say initially there 开发者_运维技巧will be only posts and images on which comments can be posted. Later on we might add videos which will also have the commenting feature.

what are the drawbacks of this approach ? and will it work for a requirement like ours ?

Thanks


The drawback is that the main table will get zillions of reads (and plenty of writes, too).

This means that there will be lots of lock contentions, heavy reindexing etc.

In order to mitigate this a bit you may consider splitting the "main table" in a series of not-so-main-tables.

Say, you will have one main table for "Posts" (possibly refined through metadata or subtables for specific types of posts, like Sticky, Announcement, Shoutbox, Private...)

One main table for Images (possibly refined for gifs, jpegs etc.)

One main table for Videos...

If this is a custom application (and not intended to be something that has to be "infinitely tweakable" like a CMS or a Portal framework) I think this kind of split is acceptable, and may provide some better performance (if you expect to have large amounts of data).

Regarding your "examples" comment... first of all, if you keep comments again in a single gigantic table you may have similar problems as if you kept all type of items in it. Assuming this is not a problem, you can obviously put a sort of reference key (you can't use the normal foreign keys, of course) that links comments to their original item.

This works fine when you go from item to comments, a bit less when you have to move from comments to the originating item. So the tradeoff is about what kind of operations would be more frequent for your problem.


Simplicity and extensibility are indeed often attractive aspects of attribute-value and (as you say) "single table of things" approaches.

There's no 100% right answer here -- depending on your performance/throughput goals and extensibility needs, this approach might work for you too.

In most cases, however, where you know what kinds of data you will store, it's usually in your interest to model distinct entities into their own tables and relate the data accordingly. RDBMSes have been architected and refined over decades to cater to this use case and to simply use tables as generic dumping grounds doesn't typically buy you any distinct advantages, except the act of delaying the inevitable need to model your data properly. Furthermore, when you boil everything into one table, you then force users outside your app itself (if you have any, for example report writers) to have to struggle with your "model within a model", which can just make folks frustrated when they write queries, etc. And you will sink to your lowest common denominator -- if you want to optimize queries about type X and you have types Y and Z in that same table in droves, they will impact performance on querying X.

Again, to be clear, there is distinct benefit to the "all things in one table" name/value style metadata approaches. I have used them myself and turned against modeling for similar reasons. However, my advice is to limit yourself to times when you really need to do that (i.e., you need to implement something before you can correctly model the space of things you will need). Most typically, I find myself doing that when I'm prototyping complex systems and I need to get something going sooner than later.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜