Storing Revisions of Relational Objects in an Efficient Way
I'm not sure if this type of question has been answered before. In my database I have a product table and a specifications table. Each product can have multiple specifications. I need to store the revisions of each product in the database so that I can query them later for history purposes.
So I need an efficient way to store each product's relations to specifications every time users change those relations. The amount of data can also become very large. For example, suppose there are 100,000 products in the database, each product can have 30 specifications, and there are a minimum of 20 revisions of each product. Storing a full copy of every revision in a single table would then mean on the order of 100,000 × 30 × 20 = 60 million rows.
Any suggestions?
If this is purely for 'archival' purposes then maybe a separate table for the revisions is better.
However if you need to treat previous revisions equally to current revisions (for example, if you want to give users the ability to revert a product to a previous revision), then it is probably best to keep a single products table, rather than copying data between tables. If you are worried about performance, this is what indexes are for.
You can create a compound primary key for the product table, e.g. PRIMARY KEY (product_id, revision). A stored procedure that finds the current revision, by selecting the row with the highest revision for a particular product_id, may also be useful.
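Here is a minimal sketch of that single-table idea, using Python's built-in sqlite3 module so it can be run as-is. The name column, the sample rows, and the helper functions current_revision and revert are assumptions made for illustration. SQLite has no stored procedures, so a plain Python function stands in for the stored proc here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER NOT NULL,
    revision   INTEGER NOT NULL,
    name       TEXT    NOT NULL,          -- hypothetical payload column
    PRIMARY KEY (product_id, revision)    -- compound key, as suggested above
);
""")

# Sample data (made up): three revisions of product 1, one of product 2.
conn.executemany(
    "INSERT INTO product (product_id, revision, name) VALUES (?, ?, ?)",
    [(1, 1, "Widget"), (1, 2, "Widget v2"), (1, 3, "Widget v3"), (2, 1, "Gadget")],
)


def current_revision(conn, product_id):
    """Return the row with the highest revision for product_id, or None."""
    return conn.execute(
        "SELECT product_id, revision, name FROM product "
        "WHERE product_id = ? ORDER BY revision DESC LIMIT 1",
        (product_id,),
    ).fetchone()


def revert(conn, product_id, to_revision):
    """Revert by copying an old revision forward as a new, highest revision."""
    conn.execute(
        "INSERT INTO product (product_id, revision, name) "
        "SELECT product_id, "
        "       (SELECT MAX(revision) + 1 FROM product WHERE product_id = ?), "
        "       name "
        "FROM product WHERE product_id = ? AND revision = ?",
        (product_id, product_id, to_revision),
    )
```

Reverting by inserting a new row, rather than deleting newer ones, keeps the full history intact. The primary key already covers lookups by (product_id, revision), so finding the current revision should not need an extra index.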
I would recommend having a history table that is an exact copy of the current table plus a HistoryDate column, and storing the revisions in that table. You can do this for all 3 tables in question.
By keeping the revisions separate from the main tables, you will not incur any performance penalties when querying the main tables.
You can also consider recording which user changed the data.
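A rough sketch of the separate-history-table approach, again using Python's built-in sqlite3 module. Only HistoryDate comes from the answer above. The product_history table name, the ChangedBy column, the name column, and the update_product helper are assumptions for illustration. The same pattern would be repeated for the other tables.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL              -- hypothetical payload column
);
-- Same columns as product, plus HistoryDate and (assumed) ChangedBy.
CREATE TABLE product_history (
    product_id  INTEGER NOT NULL,
    name        TEXT    NOT NULL,
    HistoryDate TEXT    NOT NULL,
    ChangedBy   TEXT    NOT NULL
);
CREATE INDEX idx_product_history ON product_history (product_id, HistoryDate);
""")


def update_product(conn, product_id, new_name, changed_by):
    """Archive the current row, then apply the change, in one transaction."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO product_history (product_id, name, HistoryDate, ChangedBy) "
            "SELECT product_id, name, ?, ? FROM product WHERE product_id = ?",
            (now, changed_by, product_id),
        )
        conn.execute(
            "UPDATE product SET name = ? WHERE product_id = ?",
            (new_name, product_id),
        )


# Sample data (made up).
conn.execute("INSERT INTO product (product_id, name) VALUES (1, 'Widget')")
update_product(conn, 1, "Widget v2", "alice")
update_product(conn, 1, "Widget v3", "bob")
```

In this sketch ChangedBy records the user whose change replaced the archived row. The main product table only ever holds the current row, so queries against it are unaffected by how much history accumulates.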