Denormalize for Simplicity: Ungood idea?
After reading this q开发者_StackOverflowuestion, I've learned that denormalization is not a solution for simplicity. What about this case?
I have news-articles which have a list of sites-article-will-be-published-to. The latter can be expressed in normalized fashion either by table and a many-to-many relationship (via a cross-table, I think). But the simple solution is to just throw in a bunch of booleans for the sites-article-will-be-published-to (publish_to_site_1, publish_to_site_2 etc.). Assuming the sites are:
- small in number
- will not change over time
- have no fields themselves, except a name
Is this still a terrible idea? The many-to-many relationship seems somewhat cumbersome, but I've done it before in cases like this (and it seemed cumbersome).
Note: I'm doing this in Rails, where it's not that painful. On the other hand, the metaprogramming makes things like this trivial
(1..5).each { |site| do_something(article["publish_to_site_#{site}".to_symbol]) }
If these conditions are actually satisfied, then no, it's not a terrible idea.
In fact, this is not even denormalization: Denormalization usually means that you are storing some information redundantly, for sake of performance. In your example, since the sites do not have fields themselves, you are not storing stuff redundantly. You are just depriving yourself from the opportunity to store additional fields for the sites in the future (without violating normalization or redesigning your database).
So, this is OK (normalized):
article show_on_stackoverflow show_on_my_blog
-----------------------------------------------------------------------
Denormalize for Simplicity YES NO
More simplicity YES YES
...
But this is not OK (redundancy):
article show_on_stackoverflow stackoverflow_mainpage_url show_on_my_blog my_blog_mainpage_url
------------------------------------------------------------------------------------------------------------------------------
Denormalize for Simplicity YES http://stackoverflow.com NO http://my.blog.url
More simplicity YES http://stackoverflow.com YES http://my.blog.url
...
Assumption two is unrealistic.
Therefore, in complete accordance with "If these conditions are actually satisfied, then no, it's not a terrible idea." : yes, it is a terrible idea.
If you think of the booleans "sites-article-will-be-published-to" as merely attributes of the primary datum such as "isGreen", "hasHair", "isBipedal", then a single table is normalized in the sense that it would be perverse to have a foreign key to the table Green{<true>, <false>}
.
Obviously, if your 3 conditions do not continue to hold true, the next guy will have non-trivial work to do, but "as simple as possible, but no simpler" has its utility.
Personally, I think I'd not denormalize. I my opinion, a single n:n-relation is not that much cumbersome to join if you're familiar with SQL. What might be cumbersome is using the denormalized structure for different queries. For example, are you sure you're never going to need a list of all sites an article is published to...?
Not that I'd ever call your approach terrible, but I usually prefer normalized data, happily doing another join :)
Cheers Matthias
精彩评论