
Denormalize for Simplicity: Ungood idea?

After reading this question, I've learned that denormalization is not a solution for simplicity. What about this case?

I have news articles, each of which has a list of sites-article-will-be-published-to. The latter can be expressed in a normalized fashion by a separate table and a many-to-many relationship (via a cross-table, I think). But the simple solution is to just throw in one boolean per site (publish_to_site_1, publish_to_site_2, etc.). Assuming the sites are:

  1. small in number
  2. will not change over time
  3. have no fields themselves, except a name

Is this still a terrible idea? The many-to-many relationship seems somewhat cumbersome, but I've done it before in cases like this (and it seemed cumbersome).

Note: I'm doing this in Rails, where it's not that painful. On the other hand, the metaprogramming makes things like this trivial:

(1..5).each { |site| do_something(article["publish_to_site_#{site}".to_sym]) }
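For comparison, here is a minimal sketch of what the two options might look like in Rails. The model, table, and migration names are placeholders of my own, not something fixed by the question:

# Option A: one boolean column per site on the articles table.
class AddPublishFlagsToArticles < ActiveRecord::Migration
  def change
    add_column :articles, :publish_to_site_1, :boolean, default: false
    add_column :articles, :publish_to_site_2, :boolean, default: false
    # ...and so on, one column per site
  end
end

# Option B: the normalized many-to-many via a join table.
class Article < ActiveRecord::Base
  has_many :publications
  has_many :sites, through: :publications
end

class Site < ActiveRecord::Base            # just an id and a name
  has_many :publications
  has_many :articles, through: :publications
end

class Publication < ActiveRecord::Base     # the cross-table
  belongs_to :article
  belongs_to :site
end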


If these conditions are actually satisfied, then no, it's not a terrible idea.

In fact, this is not even denormalization: denormalization usually means that you are storing some information redundantly for the sake of performance. In your example, since the sites have no fields of their own, you are not storing anything redundantly. You are just depriving yourself of the opportunity to store additional fields for the sites in the future (without violating normalization or redesigning your database).

So, this is OK (normalized):

article                        show_on_stackoverflow    show_on_my_blog
-----------------------------------------------------------------------
Denormalize for Simplicity             YES                     NO
More simplicity                        YES                     YES
...

But this is not OK (redundancy):

article                        show_on_stackoverflow    stackoverflow_mainpage_url   show_on_my_blog    my_blog_mainpage_url
------------------------------------------------------------------------------------------------------------------------------
Denormalize for Simplicity             YES              http://stackoverflow.com            NO          http://my.blog.url       
More simplicity                        YES              http://stackoverflow.com            YES         http://my.blog.url
...
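To make the distinction concrete in schema terms: if the sites ever do grow fields of their own (such as a main-page URL), that data belongs on a separate sites table, stored once per site, rather than repeated on every article row. A rough Rails sketch, reusing the placeholder models from above:

class Site < ActiveRecord::Base
  # columns: name, mainpage_url -- each stored exactly once per site
end

# The URL is looked up through the association instead of being copied
# onto every article row:
article.sites.pluck(:mainpage_url)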


Assumption two is unrealistic.

Therefore, in complete accordance with "If these conditions are actually satisfied, then no, it's not a terrible idea": yes, it is a terrible idea.


If you think of the booleans "sites-article-will-be-published-to" as merely attributes of the primary datum such as "isGreen", "hasHair", "isBipedal", then a single table is normalized in the sense that it would be perverse to have a foreign key to the table Green{<true>, <false>}.

Obviously, if your 3 conditions do not continue to hold true, the next guy will have non-trivial work to do, but "as simple as possible, but no simpler" has its utility.


Personally, I don't think I'd denormalize. In my opinion, a single n:n relation is not that cumbersome to join if you're familiar with SQL. What might be cumbersome is using the denormalized structure for different queries. For example, are you sure you're never going to need a list of all the sites an article is published to?
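As a rough illustration (using the same placeholder models as in the sketches above), that "all sites for an article" query is a one-liner with the join table, but needs per-flag checks with the boolean columns:

# Normalized: the association does the join for you.
article.sites.pluck(:name)

# Boolean columns: you have to enumerate and test each flag yourself.
(1..5).select { |i| article["publish_to_site_#{i}"] }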

Not that I'd ever call your approach terrible, but I usually prefer normalized data, happily doing another join :)

Cheers, Matthias
