开发者

data redundancy

Can referential integrity constraints help in addressing data redundancy probl开发者_如何学运维ems?


Referential integrity constraints are only a subset of "database constraints in general".

Normalization and database constraints are distinct-but-intertwined concepts.

Say you have a table CUSTOMERORDER (custID, custName, orderID), which says that "the customer identified by #custID# and who is named #custName# has placed the order identified by #orderID#".

This table is unlikely to be in 3NF because of the FD custID->custName that probably applies. But say we keep this one-table design nonetheless. What do we then have to do to enforce consistency of the data ? We have to enforce the mentioned FD. We have to see to it that if the same customer places a second order, then the custName in the two rows will be identical. We have to prohibit rows such as (1, Smith, 2) and (1, Jones, 7) to appear both in the table. That is a kind of database constraint to be enforced, in order to make our design match all the stated business rules.

But note that we do not have any "referential" constraint here. Obviously, because there is no second table to reference.

Also note in passing that this one-table design "automatically" enforces some other constraints that might not be immediately obvious. For example, our one-table design makes it impossible for an orderID to exist without a corresponding custID AND custName to also exist. (If you are thinking about nulls, stop doing so. In relational theory, there does not exist a thing such as 'null".) The "rule" that if an orderID is registered, then there must also exist a corresponding custID PLUS custName, is enforced "implicitly" by our design being a one-table one.

But now we decompose our design into a two-table one, as traditional normalization theory prescribes it :

CUSTOMER(custID, custName) KEY custID; ORDER(custID, orderID) KEY custID,orderID ;

The business rules we have to enforce are still the same, namely : (a) there cannot be two customers with the same custID but with a different name (that's our FD), and (b) there cannot be any order without a corresponding custID PLUS custName for that order.

Let's see how our two-table design handles these business rules. (a) is obviously enforced by declaring custID as being a key on CUSTOMER. As for (b), it is obvious that it will be impossible to record an orderID in ORDER without also recording a custID. But is that sufficient to guarantee that there will also be a corresponding custName for all ORDER rows ? Obviously no. That's why we need to introduce the obvious referential integrity rule between ORDER and CUSTOMER.

Thus, RI constraints indeed "help addressing data redundancy problems", in the sense that by decomposing a table, and introducing a RI constraint to the overall design, they make it possible to eliminate certain kind of redundancies while preserving certain guarantees of data integrity. Without the possibility to introduce RI constraints in a design, we'd only be eliminating redundancy at the expense of data consistency.


It can help, but not if the database design isn't normalized. Referential integrity constraints can be used in your design to reduce/remove data redundancy.

For best effect, ensure you normalize to BCNF over 3NF. This may still have some redundancy, but for most uses will be fine.


Referential integrity guarantees only referential integrity.

It's how you lay out your database that prevents redundancy (see normalization as Oded pointed out).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜