开发者

How to use Oracle without transactions?

MySQL has special table type MyISAM that does not support transactions. Does Oracle has something like this? I'd like to create write-only database(for logging) that needs 开发者_如何学Pythonto be very fast(will store a lot of data) and doesnt need transactions.


Transactions are key to SQL database operations. They are certainly fundamental in Oracle. There is no way to write permanently to Oracle tables without issuing a commit, and lo! there is the transaction.

Oracle allows us to specify tables to be NOLOGGING, which doesn't generate redo log. This is only meant to be for bulk loading (using the INSERT /*+ APPEND */ hint), with the advice to switch to LOGGING and take a back as soon as possible. Because data which is not logged is not recoverable. And if you don't want to recover it, why bother writing it in the first place?

An alternative approach is to batch up the writes in memory, and then use bulk inserts to write them. This is pretty fast.

Here is a simple log table and a proof of concept package:

create table log_table
(ts timestamp(6)
 , short_text varchar(128)
 , long_text varchar2(4000)
 )
 /

create or replace package fast_log is
     procedure init;
     procedure flush;
     procedure write (p_short log_table.short_text%type
                      , p_long log_table.long_text%type);
end fast_log;
/

The log records are kept in a PL/SQL collection, which is an in-memory structure with a session scope. The INIT() procedure initialises the buffer. The FLUSH() procedure writes the contents of the buffer to LOG_TABLE. The WRITE() procedure inserts an entry into the buffer, and if the buffer has the requisite number of entries calls FLUSH().

create or replace package body fast_log is

    type log_buffer is table of log_table%rowtype;
    session_log log_buffer;

    write_limit constant pls_integer := 1000;
    write_count pls_integer;

     procedure init
     is
     begin
        session_log := log_buffer();
        session_log.extend(write_limit);
        write_count := 0;
     end init;

     procedure flush
     is
     begin
        dbms_output.put_line('FLUSH::'||to_char(systimestamp,'HH24:MI:SS.FF6')||'::'||to_char(write_count));
        forall i in 1..write_count
            insert into log_table
                values session_log(i);
        init;
     end flush;

     procedure write (p_short log_table.short_text%type
                      , p_long log_table.long_text%type)

     is
        pragma autonomous_transaction;
     begin
        write_count := write_count+1;
        session_log(write_count).ts := systimestamp;
        session_log(write_count).short_text := p_short;
        session_log(write_count).long_text := p_long;

        if write_count = write_limit
        then
            flush;
        end if;

        commit;

     end write;

begin
    init;
end fast_log;
/

The write to log table uses the AUTONOMOUS_TRANSACTION pragma, so the COMMIT occurs without affecting the surrounding transaction which triggered the flush.

The call to DBMS_OUTPUT.PUT_LINE() is there to make it easy to monitor progress. So, let's see how fast it goes....

SQL> begin
  2      fast_log.flush;
  3      for r in 1..3456 loop
  4          fast_log.write('SOME TEXT', 'blah blah blah '||to_char(r));
  5      end loop;
  6      fast_log.flush;
  7  end;
  8  /
FLUSH::12:32:22.640000::0
FLUSH::12:32:22.671000::1000
FLUSH::12:32:22.718000::1000
FLUSH::12:32:22.749000::1000
FLUSH::12:32:22.781000::456

PL/SQL procedure successfully completed.

SQL>

Hmmm, 3456 records in 0.12 seconds, that's not too shabby. The main problem with this approach is the need to flush the buffer to round up loose records; this is a pain e.g. at the end of a session. If something causes the server to crash, unflushed records are lost. The other problem with doing stuff in-memory is that it consumes memory (durrrr), so we cannot make the cache too big.

For the sake of comparison I added a procedure to the package which inserts a single record directly in to LOG_TABLE each time it is called, again using the autonomous transactions:

 procedure write_each (p_short log_table.short_text%type
                  , p_long log_table.long_text%type)

 is
    pragma autonomous_transaction;
 begin
    insert into log_table values ( systimestamp, p_short, p_long );

    commit;

 end write_each;

Here are its timings:

SQL> begin
  2      fast_log.flush;
  3      for r in 1..3456 loop
  4          fast_log.write_each('SOME TEXT', 'blah blah blah '||to_char(r));
  5      end loop;
  6      fast_log.flush;
  7  end;
  8  /
FLUSH::12:32:44.157000::0
FLUSH::12:32:44.610000::0

PL/SQL procedure successfully completed.

SQL>

Wall clock timings are notoriously unreliable but the batched approach is 2-3 times faster than the single record appraoch. Even so, I could execute well over three thousand discrete transactions in less than half a second, on a (far from top-of-the-range) laptop. So, the question is: how much of a bottleneck is logging?


To avoid any misunderstanding:

@JulesLt had posted his answer while I was working on my PoC. Although there are similarities in our views I think the differences in suggested workaround merits posting this.


"What's the timing for write_each without the autonomous but a single commit at the end? My timings suggest it is not significant - that bulking the insert is the big win"

My timings suggest something slightly different. Replacing a COMMIT per write with a single COMMIT at the end roughly halves the elapsed time. Still slower than the bulked approach, but not by nearly as much.

The key thing here is benchmarking. My proof of concept is running about six times faster than Jules's test (my table has one index). There are all sorts of reasons why this might be - machine spec, database version (I'm using Oracle 11gR1), table structure, etc. In other words, YMMV.

So the teaching is: first decide what the right thing to do for your application, then benchmark that for your environment. Only consider a different approach if your benchmark suggests a serious performance problem. Knuth's warning about premature optimization applies.


The closest may be creating a NOLOGGING tablespace, and using the NOLOGGING option of creating the table within it - although this may only applies for bulk operations (i.e. INSERT /*+ APPEND */ hint required).

That removes the REDO, at the cost of a loss of integrity and data if the DB goes down.

I don't know that it would actually be 'faster' and you should also consider concurrency (if you have many processes trying to write to the same table, you may be better off using transactions that write pending updates into the redo logs than trying to all update the 'real' table).

I've not really investigated NOLOGGING though - I've rarely hit a point where the application bottleneck has been INSERT speed - when I have, it has been the cost of updating indexes rather than the table that has been the issue.

I've just done a quick test and on my quite underpowered development DB (with REDO enabled). Using an autonomous transaction for each row - so each row starts a new transaction and ends with a commit, I can write/commit over 1000 rows to an indexed log table in 1 second vs about .875 seconds doing 1000 inserts without commit.

Doing an insert of 1000 rows in a single hit using a bulk operation is a small fraction of a second - so if you can possibly bulk the logs up, do it.

Some other thoughts : Would an external table do the job - i.e. write to a log file which you then mount as an external table in Oracle when/if you need to read from it?


My experience is that logging is best done to a flat file. My view is that logs are generally not particularly important - UNTIL something goes wrong, at which time they become critical. Because of this I don't want transactional control of my logging. If I need to roll back a transaction because there's a problem I really don't want the logging data rolled back because that's what I'm going to use to help identify what the problem was. In addition, how do you log that there's a problem connecting to the database if the log is stored in the database that you can't connect to?

Share and enjoy.


"that needs to be very fast"

There is a trade-off (sometimes) between fast and recoverable.

In Oracle recoverability is achieved by the redo log file. Evey time you commit, the database 'log writer' does a synchronous call to write outstanding changes to the file. By synchronous, I mean it waits for the file system to confirm that the write has been successful before saying that the commit has been successful.

If you are doing lots of logging (especially from lots of sessions at once) with each line in the log file being committed independently (ag autonomous transaction) then this could well be a bottleneck.

If you don't need that level of recoverability (ie you can afford to lose the last few rows of your log data from your logs in the event of a major failure), look at the NOWAIT option of commit.

If you can't afford to lose anything, then your best bet is REALLY fast storage (which may be a battery backed cache).


What I would do for a similar case, would be to write the logs to a file (appending to a file is probably the fastest way to store your logs) and then have a process batch insert those logs into the DB at regular intervals. Unless of course inserting directly in the DB is fast enough ... but you'll have to test ...


This seems like a solution in search of a problem.

Have you benchmarked the performance? Is Oracle fast enough for you as it is? Transaction management is built-in to the way that Oracle works and trying to work around it seems like you are creating work for yourself.

You seem to have identified transaction management as a problem without really knowing if there is a problem. What happens later on when you then have multiple writers on the table? Or readers blocking writers?


PRAGMA AUTONOMOUS_TRANSACTION

This will allow you to log and commit your log without impacting the surrounding transaction(s). Logging is one of the very few acceptable use cases for autonomous transactions. It does what it says, allows you to write a pl/sql function/procedure that can commit its work without impacting the transaction it may or may not be already participating in. It is 'autonomous'.

au·ton·o·mous 1. (of a country or region) Having self-government. 2. Acting independently or having the freedom to do so: "an autonomous committee of the school board".

Oracle docs:

The AUTONOMOUS_TRANSACTION pragma changes the way a subprogram works within a transaction. A subprogram marked with this pragma can do SQL operations and commit or roll back those operations, without committing or rolling back the data in the main transaction.

CREATE OR REPLACE FUNCTION FNC_LOG(p_log_text varchar2(4000))
 RETURN NUMBER
 IS
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  -- Your brief code goes here (don't abuse the evil feature that is autonomous transactions).
END;


Another option if you need extremely high performance is to consider Oracle's TimesTen In-Memory database: http://www.oracle.com/technology/products/timesten/index.html

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜