Oracle - Triggers to create a history row on update
First: we currently have the behavior that's desired, but it's not trivial to maintain whenever any changes to the database are needed. I'm looking for anything simpler, more efficient, or easier to maintain (anything that does any of those three would be most welcome). When we perform an update, a history row is created as a copy of the current row, and the current row's values are then updated. The result is that we have a history record of how the row looked before it was updated.
Reasoning: we have to be compliant with a number of federal rules, and went this route to have a full audit history of everything, and so that we can look at the database as of any point in time and see how things looked (a future requirement). For similar reasons, I cannot change how history is recorded...any solution must result in the same data as the current triggers create.
Here's what the current triggers look like for the Contact table:
Before update (each row):
DECLARE
indexnb number;
BEGIN
:new.date_modified := '31-DEC-9999';
indexnb := STATE_PKG.newCONTACTRows.count + 1;
:new.date_start := sysdate;
:new.version := :old.version + 1;
state_pkg.newCONTACTRows(indexnb).ID := :old.ID;
state_pkg.newCONTACTRows(indexnb).PREFIX := :old.PREFIX;
state_pkg.newCONTACTRows(indexnb).FIRST_NAME := :old.FIRST_NAME;
state_pkg.newCONTACTRows(indexnb).MIDDLE_NAME := :old.MIDDLE_NAME;
state_pkg.newCONTACTRows(indexnb).LAST_NAME := :old.LAST_NAME;
--Audit columns after this
state_pkg.newCONTACTRows(indexnb).OWNER := :old.OWNER;
state_pkg.newCONTACTRows(indexnb).LAST_USER := :old.LAST_USER;
state_pkg.newCONTACTRows(indexnb).DATE_CREATED := :old.DATE_CREATED;
state_pkg.newCONTACTRows(indexnb).DATE_MODIFIED := sysdate;
state_pkg.newCONTACTRows(indexnb).VERSION := :old.VERSION;
state_pkg.newCONTACTRows(indexnb).ENTITY_ID := :old.id;
state_pkg.newCONTACTRows(indexnb).RECORD_STATUS := :old.RECORD_STATUS;
state_pkg.newCONTACTRows(indexnb).DATE_START := :old.DATE_START;
END;
Before update (once for all rows):
BEGIN
state_pkg.newCONTACTRows := state_pkg.eCONTACTRows;
END;
After update (once for all rows):
DECLARE
BEGIN
for i in 1 .. STATE_PKG.newCONTACTRows.COUNT loop
INSERT INTO "CONTACT" (
ID,
PREFIX,
FIRST_NAME,
MIDDLE_NAME,
LAST_NAME,
OWNER,
LAST_USER,
DATE_CREATED,
DATE_MODIFIED,
VERSION,
ENTITY_ID,
RECORD_STATUS,
DATE_START)
VALUES (
CONTACT_SEQ.NEXTVAL,
state_pkg.newCONTACTRows(i).PREFIX,
state_pkg.newCONTACTRows(i).FIRST_NAME,
state_pkg.newCONTACTRows(i).MIDDLE_NAME,
state_pkg.newCONTACTRows(i).LAST_NAME,
state_pkg.newCONTACTRows(i).OWNER,
state_pkg.newCONTACTRows(i).LAST_USER,
state_pkg.newCONTACTRows(i).DATE_CREATED,
state_pkg.newCONTACTRows(i).DATE_MODIFIED,
state_pkg.newCONTACTRows(i).VERSION,
state_pkg.newCONTACTRows(i).ENTITY_ID,
state_pkg.newCONTACTRows(i).RECORD_STATUS,
state_pkg.newCONTACTRows(i).DATE_START
);
end loop;
END;
The package is defined as follows (trimmed; the full version just repeats this for each table):
PACKAGE STATE_PKG IS
TYPE CONTACTArray IS TABLE OF CONTACT%ROWTYPE INDEX BY BINARY_INTEGER;
newCONTACTRows CONTACTArray;
eCONTACTRows CONTACTArray;
END;
The current result
Here's a resulting history sample:
ID First Last Ver Entity_ID Date_Start Date_Modified
1196 John Smith 5 0 12/11/2009 10:20:11 PM 12/31/9999 12:00:00 AM
1201 John Smith 0 1196 12/11/2009 09:35:20 PM 12/11/2009 10:16:49 PM
1203 John Smith 1 1196 12/11/2009 10:16:49 PM 12/11/2009 10:17:07 PM
1205 John Smith 2 1196 12/11/2009 10:17:07 PM 12/11/2009 10:17:19 PM
1207 John Smith 3 1196 12/11/2009 10:17:19 PM 12/11/2009 10:20:00 PM
1209 John Smith 4 1196 12/11/2009 10:20:00 PM 12/11/2009 10:20:11 PM
Each history record has an Entity_ID that's the ID of the current row, and the Date_Start on each new record matches the Date_Modified of the previous history row. This allows us to do point-in-time queries like:
Where (Entity_ID = :id Or ID = :id) And :myDate < Date_Modified And :myDate >= Date_Start
History can be fetched by Entity_ID = :current_id.
Is there a better approach, hopefully more maintainable/flexible, to do this? The concept is simple: when updating a row, copy it to the same table via an insert with the old values, then update the current row. But I have yet to find a simpler way to actually do that, and I'm hoping someone much trickier/wiser in Oracle has a better approach. Speed doesn't matter much; like most web applications we're 99% reads, 1% writes, and all bulk operations are inserts rather than updates, so they don't create any history.
If anyone has any ideas to simplify the maintenance on this, I'd be extremely appreciative, thanks!
Okay, this is a rewrite. What I missed when I first responded is that the application is storing its History in the main table. Now I understand why @NickCraver is so apologetic about the code.
Well the first thing to do is to hunt down the perpetrators of this design and make sure they never do it again. Storing history like this doesn't scale, makes normal (non-historical) queries more complicated and sabotages relational integrity. Obviously there are scenarios where none of that matters, and perhaps your site is one of them, but in general this is a very bad implementation.
The best way of doing this is Oracle 11g Total Recall. It's an elegant solution, with a completely invisible and efficient implementation, and - by the standards of Oracle's other chargeable extras - quite reasonably priced.
But if Total Recall is out of the question and you really must do it this way, don't allow updates. A change to an existing CONTACT record should be an insert. To make this work you may need to build a view with an INSTEAD OF trigger. It's still yucky, but not quite as yucky as what you have now.
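For illustration, here's a minimal sketch of that idea. All object and column names are assumed (loosely based on the question's table), and it demonstrates the insert-instead-of-update mechanism rather than reproducing the exact ID/Entity_ID layout the question requires:

```sql
-- Hypothetical sketch: expose only current rows through a view...
CREATE OR REPLACE VIEW contact_current AS
  SELECT * FROM contact
  WHERE date_modified = DATE '9999-12-31';

-- ...and convert application updates into versioned inserts.
CREATE OR REPLACE TRIGGER contact_current_upd
INSTEAD OF UPDATE ON contact_current
FOR EACH ROW
BEGIN
  -- Close off the row that was current.
  UPDATE contact
     SET date_modified = SYSDATE
   WHERE id = :old.id;
  -- Insert the new values as a fresh version pointing at its predecessor.
  INSERT INTO contact (id, first_name, last_name, version, entity_id,
                       date_start, date_modified)
  VALUES (contact_seq.NEXTVAL, :new.first_name, :new.last_name,
          :old.version + 1, :old.id, SYSDATE, DATE '9999-12-31');
END;
/
```

Because no trigger sits on the base table itself, this also sidesteps the mutating-table problem entirely.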
As of Oracle 11.2.0.4, Total Recall has been rebranded Flashback Data Archive and is included as part of the Enterprise Edition license (although shorn of the compressed journal tables unless you purchase the Advanced Compression option).
This largesse from Oracle ought to make FDA the normal way of storing history: it's efficient, it performs well, and it's an Oracle built-in with standard syntax to support historical queries. Alas, I expect to see half-cooked implementations with spatchcocked triggers, broken primary keys and horrible performance for many years yet, because journalling seems to be one of those distractions which developers delight in, despite the fact that it's low-level plumbing which is largely irrelevant to 99.99% of all business operations.
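For reference, the Flashback Data Archive setup is only a few statements; the archive name, tablespace and retention period here are illustrative:

```sql
-- Create an archive and attach the table to it.
CREATE FLASHBACK ARCHIVE contact_archive
  TABLESPACE users
  RETENTION 7 YEAR;

ALTER TABLE contact FLASHBACK ARCHIVE contact_archive;

-- Point-in-time query with standard syntax, no custom triggers:
SELECT *
  FROM contact AS OF TIMESTAMP
       TO_TIMESTAMP('2009-12-11 22:17:00', 'YYYY-MM-DD HH24:MI:SS')
 WHERE id = 1196;

-- Or the full set of row versions over an interval:
SELECT *
  FROM contact VERSIONS BETWEEN TIMESTAMP
       SYSTIMESTAMP - INTERVAL '1' DAY AND SYSTIMESTAMP
 WHERE id = 1196;
```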
Unfortunately there is no way to avoid referencing all the column names (:OLD.this, :OLD.that, etc.) in triggers. However, what you could do is write a program to generate the trigger code from the table definition (in USER_TAB_COLS). Then whenever the table is changed you can generate and compile a fresh copy of the triggers.
See this AskTom thread for how to do that.
In case someone has the same highly specialized case we do (Linq access makes single-table history much cleaner/easier), this is what I ended up doing to simplify what we have; I welcome any improvements. It's just a script that runs whenever the database changes, regenerating the audit triggers. The main change is PRAGMA AUTONOMOUS_TRANSACTION;, which places the history generation on an autonomous transaction so we don't have to care about mutation (which doesn't matter for how we audit):
Declare
cur_trig varchar2(4000);
has_ver number;
Begin
For seq in (Select table_name, sequence_name
From user_tables ut, user_sequences us
Where sequence_name = replace(table_name, '_','') || '_SEQ'
And table_name Not Like '%$%'
And Exists (Select 1
From User_Tab_Columns utc
Where Column_Name = 'ID' And ut.table_name = utc.table_name)
And Exists (Select 1
From User_Tab_Columns utc
Where Column_Name = 'DATE_START' And ut.table_name = utc.table_name)
And Exists (Select 1
From User_Tab_Columns utc
Where Column_Name = 'DATE_MODIFIED' And ut.table_name = utc.table_name))
Loop
--ID Insert Triggers (Autonumber for oracle!)
cur_trig := 'CREATE OR REPLACE TRIGGER ' || seq.table_name || 'CR' || chr(10)
|| 'BEFORE INSERT ON ' || seq.table_name || chr(10)
|| 'FOR EACH ROW' || chr(10)
|| 'BEGIN' || chr(10)
|| ' SELECT ' || seq.sequence_name || '.NEXTVAL INTO :new.ID FROM DUAL;' || chr(10)
|| ' IF(:NEW.ENTITY_ID = 0) THEN' || chr(10)
|| ' SELECT sysdate, sysdate, ''31-DEC-9999'' INTO :NEW.DATE_CREATED, :NEW.DATE_START, :NEW.DATE_MODIFIED FROM DUAL;' || chr(10)
|| ' END IF;' || chr(10)
|| 'END;' || chr(10);
Execute Immediate cur_trig;
--History on update Triggers
cur_trig := 'CREATE OR REPLACE TRIGGER ' || seq.table_name || '_HIST' || chr(10)
|| ' BEFORE UPDATE ON ' || seq.table_name || ' FOR EACH ROW' || chr(10)
|| 'DECLARE' || chr(10)
|| ' PRAGMA AUTONOMOUS_TRANSACTION;' || chr(10)
|| 'BEGIN' || chr(10)
|| ' INSERT INTO ' || seq.table_name || ' (' || chr(10)
|| ' DATE_MODIFIED ' || chr(10)
|| ' ,ENTITY_ID ' || chr(10);
For col in (Select column_name
From user_tab_columns ut
Where table_name = seq.table_name
And column_name NOT In ('ID','DATE_MODIFIED','ENTITY_ID')
Order By column_name)
Loop
cur_trig := cur_trig || ' ,' || col.column_name || chr(10);
End Loop;
cur_trig := cur_trig || ') VALUES ( --ID is Automatic via another trigger' || chr(10)
|| ' SYSDATE --DateModified Set' || chr(10)
|| ' ,:old.ID --EntityID Set' || chr(10);
has_ver := 0;
For col in (Select column_name
From user_tab_columns ut
Where table_name = seq.table_name
And column_name NOT In ('ID','DATE_MODIFIED','ENTITY_ID')
Order By column_name)
Loop
cur_trig := cur_trig || ' ,:old.' || col.column_name || chr(10);
If Upper(col.column_name) = 'VERSION' Then
has_ver := 1;
End If;
End Loop;
cur_trig := cur_trig || ');' || chr(10)
|| ':new.DATE_MODIFIED := ''31-DEC-9999'';' || chr(10)
|| ':new.DATE_START := SYSDATE;' || chr(10);
If has_ver = 1 Then
cur_trig := cur_trig || ':new.version := :old.version + 1;' || chr(10);
End If;
cur_trig := cur_trig || 'COMMIT;' || chr(10)
|| 'END;' || chr(10);
Execute Immediate cur_trig;
End Loop;
End;
/
If you can improve this, feel free...I've only written a handful of PL/SQL scripts (the need doesn't arise often), so there's probably a lot left to be desired there.
Answer credit to APC for getting me to look at this a bit harder. I don't recommend this history layout unless it fits the rest of your model/application/stack extremely well. For this application, we constantly show a mix of history and current rows, and filtering is far simpler than combining when it comes to Linq-to-SQL-style access. Thanks for all the answers guys, all good suggestions...and when I have more time and am not crunched by a release schedule, I'll revisit this to see if it can be improved further.
I understand your specific application requirement to have the history and current values in the same table, but perhaps this could be handled by going down the more usual route of a separate audit table, building it up as a pseudo-materialized view that presents a combined picture to the application.
For me, this has the advantage of having a simple "current" view and a separate but completely automated "audit" view (which in this case also has the current view).
Something like:
create sequence seq_contact start with 1000 increment by 1 nocache nocycle;
create table contact (
contact_id integer,
first_name varchar2(120 char),
last_name varchar2(120 char),
last_update_date date
);
alter table contact add constraint pk_contact primary key (contact_id);
create table a$contact (
version_id integer,
contact_id integer,
first_name varchar2(120 char),
last_name varchar2(120 char),
last_update_date date
);
alter table a$contact add constraint pk_a$contact primary key
(contact_id, version_id);
create or replace trigger trg_contact
before insert or delete or update on contact
for each row
declare
v_audit a$contact%rowtype;
begin
select seq_contact.nextval into v_audit.version_id from dual;
if not deleting then
:new.last_update_date := sysdate;
end if;
if inserting or updating then
v_audit.contact_id := :new.contact_id;
v_audit.first_name := :new.first_name;
v_audit.last_name := :new.last_name;
v_audit.last_update_date := :new.last_update_date;
elsif deleting then
v_audit.contact_id := :old.contact_id;
v_audit.first_name := :old.first_name;
v_audit.last_name := :old.last_name;
v_audit.last_update_date := sysdate;
end if;
insert into a$contact values v_audit;
end trg_contact;
/
insert into contact (contact_id, first_name, last_name) values
(1,'Nick','Pierpoint');
insert into contact (contact_id, first_name, last_name) values
(2, 'John', 'Coltrane');
insert into contact (contact_id, first_name, last_name) values
(3, 'Sonny', 'Rollins');
insert into contact (contact_id, first_name, last_name) values
(4, 'Kenny', 'Wheeler');
update contact set last_name = 'Cage' where contact_id = 1;
delete from contact where contact_id = 1;
update contact set first_name = 'Zowie' where contact_id in (2,3);
select * from a$contact order by contact_id, version_id;
VERSION_ID CONTACT_ID FIRST_NAME LAST_NAME LAST_UPDATE_DATE
1000 1 Nick Pierpoint 11/02/2010 14:53:49
1004 1 Nick Cage 11/02/2010 14:54:00
1005 1 Nick Cage 11/02/2010 14:54:06
1001 2 John Coltrane 11/02/2010 14:53:50
1006 2 Zowie Coltrane 11/02/2010 14:54:42
1002 3 Sonny Rollins 11/02/2010 14:53:51
1007 3 Zowie Rollins 11/02/2010 14:54:42
1003 4 Kenny Wheeler 11/02/2010 14:53:53
If you want to develop a generic solution, you might want to take a look at the DBMS_SQL package. With it you could develop a package/procedure that takes a table name as input and builds the trigger code on the fly, by examining the table structure in the data dictionary. It would be non-trivial up-front development, but a lot less maintenance in the future: if a table's structure changes, the code senses that and adapts. This method would work for any table you care to use it with.
Depending on the complexity of your database (number of tables, size, depth of PK/FK relationships, other logic in triggers), you might want to look at Oracle Workspace Management. You make an API call to put a table under workspace management which results in Oracle replacing the table with an updatable view and other corresponding objects that maintain a history of all versions of the rows.
I've used this and while there are disadvantages, one advantage for auditing is that the code objects are all generated by Oracle and their correctness is generally assumed.
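To give a flavour of it, putting a table under version control is a single call. Treat this as a sketch: the table name is taken from the question, and the generated object names follow Workspace Manager's documented conventions:

```sql
-- Replace CONTACT with an updatable view backed by a versioned
-- CONTACT_LT table; every row version is retained.
EXECUTE DBMS_WM.EnableVersioning('CONTACT', 'VIEW_WO_OVERWRITE');

-- Oracle generates a CONTACT_HIST view exposing the full history.
SELECT * FROM contact_hist WHERE id = 1196;

-- Turn it back into a plain table when no longer needed.
EXECUTE DBMS_WM.DisableVersioning('CONTACT');
```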
The only time I might recommend that historical records be stored in the same table as the "current" records is when FK links to the records must or might need to link to them. For example, one application I've seen had some FK links that would link to the record as of a "point in time", that is, if the record was updated, the FK would still link to the historical record - this was an important part of the design and separating history records into a second table would have made it more unwieldy.
Apart from that, I'd prefer that a business requirement for tracking all changes be solved using a separate history table for each table. Sure, it means more DDL, but it simplifies the application code enormously, and you'll also benefit from better performance and scalability.