开发者

Use uuid as primary and/or as surrogate key?

We are required to add UUID's to most of our objects and database tables.

Would you use the UUID as surrogate key, or rather as natural key in addition to a sequence generated surrogate key, i.e. use a private surrogate key and in addition add a column/attribute to hold the UUID?

I see that it is often used directly as surrogate / primary key. Somehow I don't like the idea.

One might view a UUID as a natural key, since it should be a unique identifier with a global meaning just like any other natural key, independent of a particular implementation of a system, i.e. if you would ever move your data to another system, the UUID must stay the same, whereas surrogate keys by definition have no real and lasting meaning.

Maybe I should have clarified more: supp开发者_如何学Goose we have an Account table. Traditionally there would be some internal surrogate key and a natural key consisting of the account number (as printed on account statements etc.).

Whereas the UUID is not as "readable" as the account number, I would view a UUID more like the natural key because it could serve the same purpose as the account number: to refer to a particular account in a unique and unchanging manner. The (traditional) surrogate key never appears outside of the system, since it is completely private and could be changed at any time, no external references must ever exist to it.

In that sense a UUID is not a typical surrogate key (?).


You are mixing things up a bit.

1) There are two definitions of surrogate keys

Surrogate (1)

This definition is based on that given by Hall, Owlett and Todd (1976). Here a surrogate represents an entity in the outside world. The surrogate is internally generated by the system but is nevertheless visible to the user or application.

Surrogate (2)

This definition is based on that given by Wieringa and De Jonge (1991). Here a surrogate represents an object in the database itself. The surrogate is internally generated by the system and is invisible to the user or application.

The surrogate (1) definition defines its usage in the data model rather than the storage model and is used in this article. See Date (1998).

(from wiki's entry on surrogate keys; read the article with a bit of scepticism - for example quote 'Surrogate keys are less expensive to join (fewer columns to compare) than compound keys' might seem reasonable on the surface, but natural compound keys will create indexes that are naturally ordered and segregated, allowing for very efficient scans when browsing or analyzing data, also due to the same logic joins that return resultsets containing several rows can actually perform much better)

Anyway, when considering surrogate keys from the perspective of the data model, you should not consider what you call a 'traditional' definition.

2) Your logic for considering UUIDs natural keys is very slippery

quoting from your question:

I would view a UUID more like the natural key because it could serve the same purpose as the account number: to refer to a particular account in a unique and unchanging manner.

This is not a defining nor distinguishing characteristic of natural keys vs surrogate keys. Natural keys have following properties (from wiki):

A natural key is a candidate key that has a logical relationship to the attributes within that row. A natural key is sometimes called a domain key.

The main advantage of a natural key over a surrogate key, which has no such logical relationship, is that it already exists; there is no need to add a new, artificial column to the schema. Using a natural key (when one can be identified) also simplifies data quality: It ensures that there can only be one row for a key; this "one version of the truth" can be verified, because the natural key is based on a real-world observation.

Normally there is no logical relationship between UUID and the attributes of the same row. However, if UUIDs are assigned by an external system and if you already have a requirement to store them as an attribute then you have that logic (similarly like you could consider a serial number or social security number a natural key).

Only in this sense UUID might stop being surogate key and yet still you might have (and probably will have) stronger and richer logic for another candidate key for the same row.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜