Would you consider this a natural primary key?
I'm writting a program with users who are identified by an id called RFC. In Mexico, this RFC is a 13 characters string used to identify tax payers in our country. No one, in the whole country, has the same RFC so I thought this would be a perfect natural primary key. This RFC would serve as a foreign key in other tables.
The thing is, I'm concerned about system performance. Do you think it would be better to use an aut开发者_运维百科o incremented integer value associated to every RFC?
Cheers!
Before you worry about perfomrance, think about privacy. Is RFC in Mexico like social security # in the USA? If so, you definately don't want to build a system that depends on it since you may be forced to treat it differently/encrypt it etc.
I would suggest an auto-incrmenting integer key and store RFC as needed for your privacy needs.
My thought is that anything subject to political whims, including "wow, there'll never be more than one person using this number" is a red flag for creating your own primary key (auto increment int).
Are you certain that only Mexican taxpayers will be users of this system? That is, are you certain that only individuals with an RFC will be users of this system, now and in the future?
For this reason and (less importantly) for performance reasons, I think I'd go for an autoincrementing integer ID.
As a general rule I prefer to use an auto incrementing integer as the row id as a primary key and foreign keys. Thats not to say you wouldn't index on the RFC for quick searching. But you might run into cases where an RFC is incorrect and needs to be changed... if its the primary key and foreign key, then it has to be changed all over the place.
Using an auto-incrementing integer wont change what your queries looks like much, and on a practical side, the small sized (in characters) numbers might make debugging easier.
It seems that in Spain there might be about 105,000 people sharing the same "unique" ID card number with someone else (out of a population of about 45,000,000). When people die, their ID numbers might be reused, so that might also be a problem for your data set. And then there are people with no ID numbers at all.
I'd personally go for UUIDs. But don't forget to write some tests: Recently I got bitten by an implementation bug in Python 2.5.
This is really an information requirements question and not a database design question.
Are you required to identify relevant persons by RFC? Are you required to enter people who have no RFC, or whose RFC is already assigned to another person previously entered?
Or are you free to make up your own requirements?
Once you know the requirements, the design will flow fairly easily.
You state "No one, in the whole country, has the same RFC". Is it possible for a person to have more than one RFC during their life time? Not sure about Mexico, but here in NZ, our Tax Department issues a person a new Tax Number when they are declared bankrupt. This means that they have 2, or more, numbers. Just something else for you to check.
精彩评论