开发者

Is it necessary to have a Primary key which is sequence generated even though I may not use it for DB operations

Hi I had parent table which has ID and some other columns and a child table which have many values based on this ID(foreign Key). I wanted to create the table with a Primary Key which is sequence and this Parent table ID as Foreign Key but later I found I had one more Foreign Key EMPID which on combination provides uniqueness. Even for retrieval and batch update i can simply use the combination of these two FKs. So do I 开发者_如何学Pythonstill need to use the sequence generator keeping in mind DB performance or just drop the Primary Column.


If you are sure you will NEVER EVER need or want to reference that sequence-generated key (and that no other objects will ever reference that ID), then it should be safe to remove it, as long as you can gaurantee that your combination of foreign keys will always provide a unique combination.

If you do notice performance problems, you may want to index the columns that are foreign keys (that will now, in combination, act as a primary key) to improve performance on querying those columns.


Relational purists would tell you that you never need a sequentially generated ID column at all - that the Primary Key in each table should be that combination of columns which together uniquely identify each row. (If there are multiple combinations of columns which do this, then pick one to be the primary key, and enforce the others using unique constraints).

Where this tends to come unstuck are where one or more of these columns are large or unwieldy, or where the number of such columns is making the primary key itself large and unwieldy. Here's where I diverge from being a relational purist, and say go ahead, add a surrogate key (which is what an IDENTITY/Sequence ID column is more properly referred to as). But make sure that the other keys are still being enforced using uniqueness constraints.

So sure, there's no reason you have to have an ID column in the table. If every table had to have one, you'd think the DB products would provide it for you automatically.


(I'd have put this in a comment, but it's too long.)

Primary keys in theory need not be immutable; however, in the real world, having mutable primary keys is a nightmare for two reasons:

  1. As the relational model grows, every child table must include every column of the parent's primary key.
  2. Primary keys, like all unique constraints, are enforced by indexes.
  3. Some DBMS's simply can't have enough columns in an index to satify such natural key constructions.
  4. Similarly, length requirements exist on the data length of index keys.
  5. Lastly, mutable primary keys must cascade throughout the entire system.

In short:

If your key values might ever change

-- a credit card number or account number is a bad primary key, because it can change, and a name is a really bad choice

or

you have any data dependent on this table through foreign keys (it's pretty easy to refactor this when the requirement arises)

then

Create a surrogate key as the primary key. It's great that you have found the natural key of the table; put a unique constraint on those columns too.


For consistency I like to have a surrogate key (an auto generated identity/sequenced primary key) in all of my tables. The overhead is minimal and it allows for a consistent programmer interface with the database structure.


I would say, it is better to impose all the restrictions and constraints on the database design. They are useful, atleast if you are in Oracle DB. Refer Tom Kyte (askTom fame). According to him, these references/constraints are used in deciding execution path.

It is interesting to note how important constraints are for query optimization. Many people think of constraints as a data integrity thing, and it’s true—they are. But constraints are used by the optimizer as well when determining the optimal execution plan. The optimizer takes as inputs.

  • The query to optimize
  • All available database object statistics
  • System statistics, if available (CPU speed, single-block I/O speed, and so on—metrics about the physical hardware)
  • Initialization parameters
  • Constraints

It could end-up in yielding better performance.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜