开发者

'tail -f' a database table

Is it possible to effectively tail a database table such that when a new row is a开发者_如何学JAVAdded an application is immediately notified with the new row? Any database can be used.


Use an ON INSERT trigger.

you will need to check for specifics on how to call external applications with the values contained in the inserted record, or you will write your 'application' as a SQL procedure and have it run inside the database.

it sounds like you will want to brush up on databases in general before you paint yourself into a corner with your command line approaches.


  • Yes, if the database is a flat text file and appends are done at the end.
  • Yes, if the database supports this feature in some other way; check the relevant manual.
  • Otherwise, no. Databases tend to be binary files.


I am not sure but this might work for primitive / flat file databases but as far as i understand (and i could be wrong) the modern database files are encrypted. Hence reading a newly added row would not work with that command.


I would imagine most databases allow for write triggers, and you could have a script that triggers on write that tells you some of what happened. I don't know what information would be available, as it would depend on the individual database.


There are a few options here, some of which others have noted:

  • Periodically poll for new rows. With the way MVCC works though, it's possible to miss a row if there were two INSERTS in mid-transaction when you last queried.
  • Define a trigger function that will do some work for you on each insert. (In Postgres you can call a NOTIFY command that other processes can LISTEN to.) You could combine a trigger with writes to an unpublished_row_ids table to ensure that your tailing process doesn't miss anything. (The tailing process would then delete IDs from the unpublished_row_ids table as it processed them.)
  • Hook into the database's replication functionality, if it provides any. This should have a means of guaranteeing that rows aren't missed.

I've blogged in more detail about how to do all these options with Postgres at http://btubbs.com/streaming-updates-from-postgres.html.


tail on Linux appears to be using inotify to tell when a file changes - it probably uses similar filesystem notifications frameworks on other operating systems. Therefore it does detect file modifications.

That said, tail performs an fstat() call after each detected change and will not output anything unless the size of the file increases. Modern DB systems use random file access and reuse DB pages, so it's very possible that an inserted row will not cause the backing file size to change.

You're better off using inotify (or similar) directly, and even better off if you use DB triggers or whatever mechanism your DBMS offers to watch for DB updates, since not all file updates are necessarily row insertions.


I was just in the middle of posting the same exact response as glowcoder, plus another idea:

The low-tech way to do it is to have a timestamp field, and have a program run a query every n minutes looking for records where the timestamp is greater than that of the last run. The same concept can be done by storing the last key seen if you use a sequence, or even adding a boolean field "processed".


With oracle you can select an psuedo-column called 'rowid' that gives a unique identifier for the row in the table and rowid's are ordinal... new rows get assigned rowids that are greater than any existing rowid's.

So, first select max(rowid) from table_name

I assume that one cause for the raised question is that there are many, many rows in the table... so this first step will be taxing the db a little and take some time.

Then, select * from table_name where rowid > 'whatever_that_rowid_string_was'

you still have to periodically run the query, but it is now just a quick and inexpensive query

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜