A Never Delete Relational DB Schema Design
I am considering designing a relational DB schema for a DB that never actually deletes anything (sets a deleted flag or something).
1) What metadata columns are typically used to accomodate such an architecture? Obviously a boolean flag for IsDeleted can be set. Or maybe just a timestamp in a Deleted column works better, or possibly both. I'm not sure which method will cau开发者_StackOverflowse me more problems in the long run.
2) How are updates typically handled in such architectures? If you mark the old value as deleted and insert a new one, you will run into PK unique constraint issues (e.g. if you have PK column id, then the new row must have the same id as the one you just marked as invalid, or else all of your foreign keys in other tables for that id will be rendered useless).
If your goal is auditing, I'd create a shadow table for each table you have. Add some triggers that get fired on update and delete and insert a copy of the row into the shadow table.
Here are some additional questions that you'll also want to consider
How often do deletes occur. What's your performance budget like? This can affect your choices. The answer to your design will be different depending of if a user deleting a single row (like lets say an answer on a Q&A site vs deleting records on an hourly basis from a feed)
How are you going to expose the deleted records in your system. Is it only through administrative purposes or can any user see deleted records. This makes a difference because you'll probably need to come up with a filtering mechanism depending on the user.
How will foreign key constraints work. Can one table reference another table where there's a deleted record?
When you add or alter existing tables what happens to the deleted records?
Typically the systems that care a lot about audit use tables as Steve Prentice mentioned. It often has every field from the original table with all the constraints turned off. It often will have a action field to track updates vs deletes, and include a date/timestamp of the change along with the user.
For an example see the PostHistory Table at https://data.stackexchange.com/stackoverflow/query/new
I think what you're looking for here is typically referred to as "knowledge dating".
In this case, your primary key would be your regular key plus the knowledge start date.
Your end date might either be null for a current record or an "end of time" sentinel.
On an update, you'd typically set the end date of the current record to "now" and insert a new record the starts at the same "now" with the new values.
On a "delete", you'd just set the end date to "now".
- i've done that.
2.a) version number solves the unique constraint issue somewhat although that's really just relaxing the uniqueness isn't it.
2.b) you can also archive the old versions into another table.
精彩评论