开发者

Design for simple database for an event stream

For a personal project, I want to record a stream of unrelated events, to gather simple statistics on later. Each event can have [wildly] different data associated with it.

I'm looking for some advice on how to structure this. As this is a small project, my main goals are:

  • Ease of setup
  • Ease of use (i.e. no strange joins)

Performance isn't critical, and I will be the only user.

To give you an idea of what I'm aiming for, here is my current plan:

Table: Event_Definitions

Columns:

  • ID
  • Name
  • Type
  • Options

Table: Events

Columns:

  • ID
  • Definition_ID
  • Option_Values
  • Notes

So say we have two event definitions, like this:

ID: 0; Name: Pigeon_Released; Type: Time; Options: null

ID: 1; Name: Fed_Pigeon; Type: Fixed_List; Options: bread,crackers,tofu

Then we log some events:

ID: 0; Definition_ID: 1; Option_Values: bread; Notes: null

ID: 1; Definition_ID: 1; Option_Values: tofu; Notes: "he cooed"

ID: 2; Definition_ID: 0; Option_Values: 12:34:56; Notes: "I cooed too"

The option values will be enforced through the program.


After a number of events have been collected, I will be collecting together events of the same type for comparison. I don't expect to be retrieving events with specific values, just events of the same type.


So the question is, again, any advice or comments on this strategy or alternatives? I appreciate how simple and straightforward 开发者_JAVA百科this approach is, but it bothers me that even though the 'values' for an even could be strings, times, index numbers, etc. etc., they all get stored in the same column.


Effectively you've got a "blob" of data that could in principle hold anything, and then a schema definition to say how the blob should be interpreted

it's a date

or

it's one of these values

This is quite reasonable for a personal project. A downside of the "blob" approach is that there can be something of an impedence mismatch when doing queries. If you need to work with the blob contents it's going to be a little cumbersome.

Another comment is that your defintion table might become restrictive, you want to put some more sophisticated, structured data in your blob.

I wonder if using XML as your blob helps. The XML schemas then define the blob contents, maybe you don't need your event definition table at all. Does your database (as some do) have any XML capabilities you could exploit?


An RDBMS is not the best technology solution to use as an event stream, and your requirements for variable attributes does not fit the relational paradigm.

Try using a Message Queue instead.


Okay, if you have the need to store and compare events, that's different. When I hear "event" I assume it's just for real-time notification. So my suggestion of a Message Queue may not be appropriate in this case.

Still, relational databases don't support variable attributes in a table easily. You can try using design patterns like Concrete Table Inheritance or Class Table Inheritance to solve your problem.


Most event-steam apps use XML to define event-objects. Though many do have SQL-like language, they do not use underlying RDBMS. You may want to check out Marco of ruleCore who went through the whole exercise of designing their own CEP app and blogged about it.

Even stream engines like StreamBase and Coral8 require schema for the event-streams at design time, so even they may not fit your requirement for "wildly different data".

So, at the end you may need to use something like:

<event>
    <head>
        <id>12784536</id>
        <type_id>51</type_id>
        <time_stamp>2008-12-11T13:25:57.014Z</time_stamp>
        <source_id>862</source_id>
    </head>
    <body>
        <!-- Event specific data here -->
    </body>
</event>


I have tried a small project with rulecore to develop a system that is a bit like yours. I used a mysql db to store a stream of events and then sent them in batch to rulecore where I had created about 20 rules. The rulecore event format is very simple with named properties which can contain just about anything. I did this too as my first try with sql queries resulted in a complex schema and very long and hard to understand queries. The rulecore rules was much simpler.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜