开发者

How to keep historic details of modification in a database (Audit trail)?

I'm a J2EE developer & we are using hibernate mapping with a PostgreSQL database.

We have to keep track of any changes occurs in the database, in others words all previous & current values of any field should be saved. Each field can be any type (bytea, int, char...)

With a simple table it is easy but we a graph of objects things are more difficult.

So we have, speaking in a UML point of view, a graph of objects to store in the database with every changes & the user.

Any idea or pattern 开发者_高级运维how to do that?


A common way to do this is by storing versions of objects.

If add a "version" and a "deleted" field to each table that you want to store an audit trail on, then instead of doing normal updates and deletes, follow these rules:

  • Insert - Set the version number to 0 and insert as normal.
  • Update - Increment the version number and do an insert instead.
  • Delete - Increment the version number, set the deleted field to true and do an insert instead.
  • Retrieve - Get the record with the highest version number and return that.

If you follow this pattern, every time you update you will create a new record rather than overwriting the old data, so you will always be able to track back and see all the old objects.

This will work exactly the same for graphs of objects, just add the new fields to each table within the object graph, and handle each insert/update/delete for each table as described above.

If you need to know which user made the modification, you just add a "ModifiedBy" field as well.

(You can either do this processing in your DA layer code, or if you prefer you can use database triggers to catch your update/delete/retrieve calls and re-process them following the rules.)

Obviously, you need to consider space requirements, as every single update will result in a fully new record. If your application is update heavy, you are going to generate a lot of data. It's common to also include a "last modified time" fields so you can process the database off line and delete data older than required.


Current RDBMS implementations are not very good at handling temporal data. That's one reason why maintaining separate journalling tables through triggers is the usual approach. (The other is that audit trails frequently have different use cases to regular data, and having them in separate tables makes it easier to manage access to them). Oracle does a pretty slick job of hiding the plumbing in its Total Recall product, but being Oracle it charges $$$ for this.

Scott Bailey has published a presentation on temporal data in PostgreSQL. Alas it won't help you right now but it seems like some features planned for 8.5 and 8.6 will enable the transparent storage of time-related data. Find out more.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜