Where are all the native revisioned databases?
I've read all the SO questions, the Coding Horror articles, and Googled my brains off searching 开发者_JS百科for the best ways to revision control data. They all work and they all have their appropriate implementations based on use cases and so on. What I really want to know is why hasn't a database been written to natively support revisioning on the data-level?
What I am baffled with is that the API is already practically in place with transactions. We start a transaction, change some data, and commit. We are authenticating against the database too so blame is present. My company stores end of month versions of our entire database for accounting purposes which equate to tags. Does this not scream RCS?
Branching is something that databases could benefit from greatly too in regards to schema more than data. Since I really only care about data and this would increase the difficulty of implementation by a massive degree I'll stick to just tags and commits.
Now I know that databases are incredibly time-critical applications so any unnecessary overhead is shunned into oblivion and some databases are epic-level huge and revisions will only exponentiate that size. A per-table, opt-in revision control undoubtedly has a place in small to medium scale environments where there are milliseconds to spare and data history has a degree of importance. I want commits, I want logs, I want reverts, I want diffs, I want blame, I want tags, and I want checkouts. I want MF-ing revision control.
I have a question in there somewhere...
One native solution is Oracle's Flashback Database (aka Total Recall). It is a chargeable extra to the Enterprise Edition, but it is pretty cool. It transparently stores versions of the data for as long as we want to retain it, and supplies syntax to query old versions of the data. It can be enabled on a table-by-table basis.
Essentially Flashback DB is like using triggers to store records in tracking tables, but slick, performant and invisible to normal working.
You could read about temporal databases.
In "Temporal Data & the Relational Model" by Date, Darwen, and Lorentzos, the authors introduce a sixth normal form to account for issues in tracking temporal data.
Richard Snodgrass proposed TSQL2 as an extension to SQL to handle temporal data.
Implementations include:
- Oracle Workspace Manager
- TimeDB
Several DBMSs implement engine-level versioning mechanisms. Unfortunately there is no vendor-independent standard for this so they are all proprietary. Oracle flashback has already been mentioned. Microsoft's Change Data Capture feature in SQL Server is another one.
You forgot I want performance. A DBMS is a pretty low level data storage mechanism, and in systems with billions of rows, performance can be important. Therefore, if you want this sort of auditing system, you can build it yourself using the tools available to you (eg. triggers).
Just as in a filesystem, not all files are appropriate for version control, in a database not all rows would be appropriate for version control either.
精彩评论