Multi version entity database design
I am currently working on a project in which we need to keep multiple versions of a given entity in the database.
For example let’s assume we have an entity called House which has rooms. Now whenever a room is added, updated or deleted from the house, or a property on the house or any of the rooms is changed, we need to keep both the old and new information. Preferably we would like to group changes together in a “change set” (really the problem is sort of like source control).
Now the actual model is quite a bit more complicat开发者_StackOverflowed than this, but I think this example illustrates the problem.
I would very much like to hear some possible solutions to this problem.
We are running Microsoft SQL Server 2008 R2.
In data warehouses, this is generally solved using a Slowly Changing Dimension (SCD).
As a brief summary, you can:
- Make a surrogate key (an instance id) and use the natural key and a version number (or effective date/time range) to keep track of changes (Type 2 SCD)
- Have N columns in your room table for N versions (Type 3 SCD)
- Have a separate "rooms_history" table and only keep the current instance in the rooms table (Type 4)
My team has traditionally used a combination of Type 2 (for more "recent" data) and Type 4 (for "older" data in an archive). We group changes using a change_id
in the objects being changed that is a reference to a change
object that includes the date of the change as well as other metadata (user, comments, etc.)
精彩评论