Enumerator for collection that is time/update aware

I have a situation where I want to access historical information so that I can retrospectively generate metrics or see how things stood at a point in the past. Until now I have stored only the most recent content (i.e. the record with every update ever applied), but I would now like to be able to turn back the clock.

Before rolling my own solution:

  • Does something else already exist?
  • Is this a standard pattern?
  • Where am I likely to hit problems?

I'm not expecting consumers of the records to be able to change them, so any 'updates' should be marshalled through the repository, which will create a new record containing the full content.

Ideally I'd like to move this over to an SQL backend, so if patterns exist there I'd like to remain close to them.


My basic design thoughts are:

Define an interface, say IUpdatableRecord:

public interface IUpdatableRecord<K>
{
    K Key { get; }
    DateTime Updated { get; }
}

Define a repository with enumeration capability:

public class DataRepository<K> : IEnumerable<IUpdatableRecord<K>>
{
    // Some internal collection that allows duplicate keys
    private IList<IUpdatableRecord<K>> dataStore = ....;

    // Enumerates every stored version of every record
    public IEnumerator<IUpdatableRecord<K>> GetEnumerator()
    {
        return dataStore.GetEnumerator();
    }

    // Non-generic overload required by IEnumerable (System.Collections)
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    // Enumerator for contents as of a specific date-time
    public IEnumerator<IUpdatableRecord<K>> GetEnumerator(DateTime refDate)
    {
        // Group by key (so all versions of a record are together)
        var groupedByKey = dataStore.GroupBy(r => r.Key);

        foreach ( var versions in groupedByKey )
        {
            // Ignore updates made at or after the reference date, sort the
            // rest into date/time order and keep the last (or default)
            var last = versions.Where(r => r.Updated < refDate)
                               .OrderBy(r => r.Updated)
                               .LastOrDefault();

            // yield last record if any
            if ( last != null )
            {
                yield return last;
            }
        }
    }

    // code for 'adding/updating' a record.
}
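
One possible shape for the 'adding/updating' part, as a minimal self-contained sketch rather than a finished design: every update appends a new full version instead of mutating an old one, and the as-of lookup filters first and then groups. The names PriceRecord, PriceHistory, Save, AsOf and PriceAsOf are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type for illustration; not part of the design above.
public sealed class PriceRecord
{
    public PriceRecord(string key, DateTime updated, decimal price)
    {
        Key = key;
        Updated = updated;
        Price = price;
    }

    public string Key { get; }
    public DateTime Updated { get; }
    public decimal Price { get; }
}

// Append-only store: an 'update' never mutates a record, it adds a new version.
public sealed class PriceHistory
{
    private readonly List<PriceRecord> versions = new List<PriceRecord>();

    public void Save(string key, DateTime updated, decimal price)
    {
        versions.Add(new PriceRecord(key, updated, price));
    }

    // Latest version of each key whose timestamp is strictly before refDate,
    // using the same '<' comparison as GetEnumerator(DateTime).
    public IEnumerable<PriceRecord> AsOf(DateTime refDate)
    {
        return versions
            .Where(r => r.Updated < refDate)
            .GroupBy(r => r.Key)
            .Select(g => g.OrderByDescending(r => r.Updated).First());
    }
}

public static class PriceHistoryDemo
{
    // Convenience lookup: price of one key as of refDate, or null if the key
    // had no version before that moment.
    public static decimal? PriceAsOf(PriceHistory history, string key, DateTime refDate)
    {
        var match = history.AsOf(refDate).FirstOrDefault(r => r.Key == key);
        return match == null ? (decimal?)null : match.Price;
    }
}
```

With versions of one key saved on 1 January and 1 June, PriceAsOf for 1 February returns the January price, and for any date on or before 1 January it returns null. Filtering before grouping means keys with no qualifying version never produce a group at all.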


If you want this solution to use an SQL backend, then you should consider either the ADO.NET Entity Framework or LINQ to SQL.

Your main potential problem is your enumerator, so you'd need to look at a number of approaches, check the SQL being generated by LINQ (LINQPad is good for this) and make sure it's efficient.


There are a couple of solutions you can implement in an RDBMS. One is to store all the historical values for a given key value along with a 'valid-from' and a 'valid-to' date (this is commonly used in data warehousing solutions). This can complicate your queries quite a lot and is a pain to retrofit to an existing solution. Another way, which I have used, is to hold history tables that are replicas of the existing tables plus a few extra metadata columns; by forcing all updates through procedures, you can write the pre-change record to the history table. Any existing queries on the core tables are fine because they see the current value, but for historical queries you can UNION the core and history tables (via a view in my case) to provide the source for your LINQ objects.
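
To make the 'valid-from'/'valid-to' idea concrete, here is a minimal in-memory sketch in C#. It is an illustration under stated assumptions, not a database implementation: the names VersionRow, ValidityTable, Upsert and ValueAsOf are hypothetical, the interval is treated as half-open (valid-from inclusive, valid-to exclusive), a null valid-to marks the current row, and updates are assumed to arrive in chronological order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: one row per version, with a half-open validity
// interval [ValidFrom, ValidTo). ValidTo is null while the row is current.
public sealed class VersionRow
{
    public VersionRow(string key, string value, DateTime validFrom)
    {
        Key = key;
        Value = value;
        ValidFrom = validFrom;
    }

    public string Key { get; }
    public string Value { get; }
    public DateTime ValidFrom { get; }
    public DateTime? ValidTo { get; private set; }

    // Marks the row as superseded from the given moment onwards.
    public void Close(DateTime at)
    {
        ValidTo = at;
    }

    public bool IsValidAt(DateTime at)
    {
        return ValidFrom <= at && (ValidTo == null || at < ValidTo.Value);
    }
}

public sealed class ValidityTable
{
    private readonly List<VersionRow> rows = new List<VersionRow>();

    // An update closes the current row for the key (if any) and opens a new
    // one. Assumes updates are applied in chronological order.
    public void Upsert(string key, string value, DateTime at)
    {
        var current = rows.FirstOrDefault(r => r.Key == key && r.ValidTo == null);
        if (current != null)
        {
            current.Close(at);
        }
        rows.Add(new VersionRow(key, value, at));
    }

    // The as-of lookup is a plain range filter; no grouping or sorting.
    public string ValueAsOf(string key, DateTime at)
    {
        var row = rows.FirstOrDefault(r => r.Key == key && r.IsValidAt(at));
        return row == null ? null : row.Value;
    }
}
```

The trade-off shows up even at this scale: reads become a simple range check, but every write has to find and close the previous row, which is the part that tends to be awkward to retrofit.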

In SQL Server, I use a separate History schema for the history tables so you can re-use the table names.
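
The history-table approach can be mimicked in memory to show how the pieces fit together. This is only a sketch under assumptions: AccountRow, AccountStore and their members are hypothetical names, the Update method stands in for the stored procedure that all updates go through, AllVersions stands in for the UNION view, and the extra metadata columns are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape shared by the core table and its history replica.
public sealed class AccountRow
{
    public AccountRow(int id, string status, DateTime modified)
    {
        Id = id;
        Status = status;
        Modified = modified;
    }

    public int Id { get; }
    public string Status { get; }
    public DateTime Modified { get; }
}

public sealed class AccountStore
{
    // 'core' holds only the current row per id; 'history' holds superseded rows.
    private readonly Dictionary<int, AccountRow> core = new Dictionary<int, AccountRow>();
    private readonly List<AccountRow> history = new List<AccountRow>();

    // Plays the role of the update procedure: copy the pre-change row to
    // history, then overwrite the core row.
    public void Update(int id, string status, DateTime modified)
    {
        AccountRow previous;
        if (core.TryGetValue(id, out previous))
        {
            history.Add(previous);
        }
        core[id] = new AccountRow(id, status, modified);
    }

    // Existing 'current value' queries only ever touch the core table.
    public IEnumerable<AccountRow> Current()
    {
        return core.Values;
    }

    // Plays the role of the UNION view: core and history rows together.
    public IEnumerable<AccountRow> AllVersions()
    {
        return core.Values.Concat(history);
    }

    // Historical query over the combined view: latest row modified at or
    // before the given moment, or null if there is none.
    public AccountRow AsOf(int id, DateTime at)
    {
        return AllVersions()
            .Where(r => r.Id == id && r.Modified <= at)
            .OrderByDescending(r => r.Modified)
            .FirstOrDefault();
    }
}
```

Code that only calls Current() never sees the history rows, which mirrors why existing queries on the core tables keep working unchanged.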


I have subsequently discovered that this problem has a name: the type of database I wanted is a 'temporal database'.

Googling the term turned up some very useful links, including a complete (out of print) book referenced from this similar discuss.joelonsoftware question.

The book referenced is available as a free PDF here: "Developing Time-Oriented Database Applications in SQL"
