What is the best schema design for child collection with a "primary" entity

2022-12-30 09:11 问答作者：

Here is the scenario: You have a Persons table with a one-to-many relationship with an Addresses table, where one of the Address rows is the "primary" Address.

Is it better in a normalized schema to

Use a Persons.PrimaryAddressID to access the "primary" Address for a Person

Use an Addresses.IsPrimary bit column to reference the "primar开发者_开发问答y" Address for a Person via Addresses.PersonID

Other

and why?

It depends on whether the person-to-address relationship is one-to-zero-plus or one-to-one-plus.

If a person is required to have a primary address, I would put it in the Persons table itself (since it's a required attribute).

In the other hand, if a person can exist in your schema without an address, I would leave all addresses in the Addresses table as equal and use an attribute of the Persons table to select the primary (either NULL or a pointer to the relevant Addresses row).

If you store the primality of an address in the Addresses table, what do you do when two addresses for Bob Smith both claim to be the primary? You could stop that with triggers but it's far more efficient to design your schema properly.

And, if two room-mates share the same address, but one lives there all the time and the other spends most of his time shacked up with his girlfriend, what happens then? If the primality is in the Addresses table, you won't be able to share address rows between persons.

What I'm trying to get across is that you need to allocate your attributes to the right objects. A person's primary address belongs to a person, not an address.

For maximum efficiency and flexibility, I would have the following schema:

Persons:
    Id                 primary key
    PrimaryAddressId
    OtherStuff
Addresses:
    Id                 primary key
    OtherStuff
PersonAddresses:
    Id                 primary key
    PersonId           foreign key on Persons(Id)
    AddressId          foreign key on Addresses(Id)

You have the minor data integrity problem that Persons.PrimaryAddressId may be a hanging pointer. You can't make it a foreign key to one of the primary keys since you want it to allow NULL. That means you'll have to cater for the possibility that it might point to a non-existent Addresses.Id.

I would simply fix that as a before-delete trigger on Addresses so that the relevant Persons rows are updated (setting PrimaryAddressid to NULL).

Or you could be tricky and have one address of "Unknown" in the Addresses table so that every row in Persons has at least one address (those whose primary address is unknown automatically get their PrimaryAddressid set to the "Unknown" address row.

Then you could make it a proper constrained relationship and simplify your SQL somewhat. Pragmatism often beats dogmatism in the real world :-)

I would go for "Use a Persons.PrimaryAddressID to access the "primary" Address for a Person". Primary Address has a meaning only when Person is linked to Addresses. So it should belong to Person. Think about following scenarios where 2nd approach fails.

a) Address is used with another entity without referenced to person where Addresses.IsPrimary is meaningless.

b) Same address is used by two persons where 1st uses as primary while 2nd not.

If you want the constraint to be that one person has at most one primary address, a Persons.PrimaryAddressID is clearly simpler -- indeed, it's enforced by the very schema. It's also easy to ensure exactly one primary address per person (just make that column be not null), and if you need to, even say that no two people can share a primary address (just make that column be unique).

Of course, exactly because this approach excels at enforcing such simple constraints, it's bad when you don't want these constraints -- for example, if you want to make it possible for a person to have more than one "primary" address, the approach in question would not work.

Incidentally, I would not consider a one-Person/many-addresses relationship to be particularly good unless you want to enforce the fact that no two people can share the same address: in general, in a normalizing mood, I'd rather have a table of people, one of addresses, and one for the relationship (which in most contexts would naturally be many-to-many, since many people can, and in real life do, share the same address).

If you choose to go this route, then, especially if you need high flexibility (multiple primary addresses &c), having the relationship table carry the "primality" boolean would be an attractive choice (it still makes it not too hard to enforce some of the above-mentioned constraints, though other constraints, such as "an address belongs to at least one person" or vice versa, can be tricky to express simply).

Lesson to retain: express exactly what constraints your schema needs to express simply, and the right schema for that purpose often emerges quite clearly. If the constraints of interest are a mystery, so will be the answer to the question "what's the right schema";-).

继续阅读：database-design database-schema

What is the best schema design for child collection with a "primary" entity

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？