开发者

What is the best way to implement Polymorphic Association in SQL Server?

I have tons of instances where I need to implement some sort of Polymorphic Associa开发者_JAVA百科tion in my database. I always waste tons of time thinking through all the options all over again. Here are the 3 I can think of. I'm hoping there is a best practice for SQL Server.

Here is the multiple column approach

What is the best way to implement Polymorphic Association in SQL Server?

Here is the no foreign key approach

What is the best way to implement Polymorphic Association in SQL Server?

And here is the base table approach

What is the best way to implement Polymorphic Association in SQL Server?


Another common Name for this model is the Supertype Model, where one has a base set of attributes that can be expanded via joining to another entity. In Oracle books, it is taught both as a logical model and physical implementation. The model without the relations would allow data to grow into invalid state and orphan records I would strongly validate the needs before selecting that model. The top model with the relation stored in the base object would cause nulls, and in a case where fields were mutually exclusive you would always have a null. The bottom diagram where the key is enforced in the child object would eliminate the nulls but also make the dependency a soft depenendency and allow orphans if cascading was not enforced. I think assessing those traits will help you select the model that fits best. I have used all three in the past.


I used the following solution to solve a similar problem :

Many-Many based design : Even though the relation is a 1-Many between an ObjectN and Something, it is equivalent to a Many-Many relationship with a modification of the PK of the relation table.

First i create a relation table between an ObjectN and Something per Object and then i use the Something_ID column as the PK.

This is the DDL of the Something-Object1 relationship which is the same for Object2 and Object3 as well :

CREATE TABLE Something
(
    ID INT PRIMARY KEY,
    .....
)

CREATE TABLE Object1
(
   ID INT PRIMARY KEY,
   .....
)

CREATE TABLE Something_Object1
(
    Something_ID INT PRIMARY KEY,
    Object1_ID INT NOT NULL,
    ......

    FOREIGN KEY (Something_ID) REFERENCES Something(ID),
    FOREIGN KEY (Object1_ID) REFERENCES Object1(ID)
)

More details and examples of other possible options in this ticket multiple-foreign-keys-for-the-same-business-rule


The two most common approaches are Table Per Class (i.e. a table for the base class and another table for each subclass that contains the additional columns necessary to describe the subclass) and Table Per Hierarchy (i.e. all columns in one table, with one ore more columns to allow for the discrimination of subclasses). Which is the better approach really depends on the particulars of your application and data access strategy.

You would have Table Per Class in your first example by reversing the direction of the FK and removing the extra ids from the parent. The other two are essentially variants of table per class.


According to me your first type of approach is the best way you can define the data as well as your classes but As your all primary data should be avail for the child.

So you can check your requirement and define the Database.


Approach 1 is best but association between something and object1, object2 ,object3 should be one to one.

I mean FK in child (object1, object2, object3) table should be non null unique key or Primary key for child table.

object1, object2 ,object3 can have Polymorphic object value .


I have used what I guess you would call the base table approach. For example, I had tables for names, addresses, and phonenumbers, each with an identity as PK. Then I had a main entity table entity(entityID), and a linking table: attribute(entityKey, attributeType, attributeKey), wherein the attributeKey could point to any of the first three tables, depending on the attributeType.

Some advantages: allows as many names, addresses, and phonenumbers per entity as we like, easy to add new attribute types, extreme normalization, easy to mine common attributes (i.e. identify duplicate people), some other business-specific security advantages

Disadvantages: quite complex queries to build simple result sets made it difficult to manage (i.e. I had trouble hiring people with good enough T-SQL chops); performance is optimal for VERY specific use cases rather than general; query optimization can be tricky

Having lived with this structure for several years out of much longer career, I would hesitate to use it again unless I had the same weird business logic constraints and access patterns. For general usage, I strongly recommend your typed tables directly reference your entities. That is, Entity(entityID), Name(NameID, EntityID, Name), Phone(PhoneID, EntityID, Phone), Email(EmailID, EntityID, Email). You will have some data repetition and some common columns, but it will be much easier to program to and optimize.


There is no single or universal best practice to achieve this. It all depends on the type of access the applications will need.

My advice would be to make an overview on the expected type of access to these tables:

  1. Will you use an OR layer, stored procedures or dynamic SQL?
  2. What numbers of records do you expect?
  3. What is the level of difference between the different subclasses? How many columns?
  4. Will you be doing aggregations or other complex reporting?
  5. Will you have a data warehouse for reporting or not?
  6. Will you often need to process records of different subclasses in one batch? ...

Based on answers to this questions, we could work out an appropriate solution.

One additional possibility to store properties specific to subclasses is to use a table with Name/value pairs. This approach can be particularly useful if there is a large number of different subclasses or when the specific fields in the subclasses are used infrequently.


I have used the first approach. Under extreme loads the "Something" table becomes a bottleneck.

I took the approach of having template DDL for my different objects with the attribute specializations being appended to the end of the table definition.

At the DB level if I genuinely needed to represent my different classes as a "Something" recordset then I put a view over the top of them

SELECT "Something" fields FROM object1
UNION ALL
SELECT "Something" fields FROM object2
UNION ALL
SELECT "Something" fields FROM object3

The challenge is as to how you assign a non-clashing primary key given that you have three independent objects. Typically people use a UUID/GUID however in my case the key was an 64 bit integer generated in an application based on a time and machine in order to avoid clashes.

If you take this approach then you avoid the problem of the "Something" object causing locking/blocking.

If you want to alter the "Something" object then this can be awkward now you have three independent objects, all of which will require their structure to be altered.

So to summarise. Option One will work fine in most cases however under seriously heavy load you may observe locking blocking that necessitates splitting out the design.


Approach 1 with multiple columns foreign keys is the best one. Because that way you can have pre-defined connections with other tables And that makes it much easier for scripts to select, insert and update data.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜