Insert many records using ADO

I am looking for the fastest way to insert many records at once (1000+) into a table using ADO.

Options:

  1. using insert commands and parameters

    ADODataSet1.CommandText:='INSERT INTO .....';    
    ADODataSet1.Parameters.CreateParameter('myparam',ftString,pdInput,12,''); 
    ADODataSet1.Open;
    
  2. using TAdoTable

    AdoTable1.Insert;
    AdoTable1.FieldByName('myfield').Value:=myvalue;
    //..
    //..
    //..
    AdoTable1.FieldByName('myfieldN').value:=myvalueN;
    AdoTable1.Post;
    
I am using Delphi 7, ADO, and Oracle.


Probably your fastest way would be option 2: insert all the records and tell the dataset to send them off to the DB. But FieldByName is slow, and you shouldn't use it in a big loop like this. If you already have the fields (because they're defined at design time), reference the fields in code by their actual names. If not, call FieldByName once for each field, store the results in local variables, and reference the fields through those when you're inserting.
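
For example, here is a minimal sketch of that second approach with the TField lookups hoisted out of the loop (the MyValues/MyOtherValues source arrays are invented for illustration):

    // rough sketch: look each field up once, then reuse the TField references
    procedure AppendRows(const MyValues, MyOtherValues: array of Variant);
    var
      fld1, fldN: TField;
      i: Integer;
    begin
      fld1 := AdoTable1.FieldByName('myfield');
      fldN := AdoTable1.FieldByName('myfieldN');
      for i := Low(MyValues) to High(MyValues) do
      begin
        AdoTable1.Insert;
        fld1.Value := MyValues[i];
        //..
        fldN.Value := MyOtherValues[i];
        AdoTable1.Post;
      end;
    end;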


Using ADO I think you may be out of luck. Not all back-ends support bulk insert operations, so ADO implements an abstraction that allows consistent coding of apparent bulk operations (batches) irrespective of back-end support; under the hood it merely sends the "batch" as a large number of parameterised, individual inserts.

The downside of this is that even those back-ends which do support bulk inserts do not always code this into their ADO/OLEDB provider(s) - why bother? (I've seen it mentioned that the Oracle OLEDB provider supports bulk operations and that it is ADO which denies access to it, so it's even possible that the ADO framework simply does not allow a provider to support this functionality more directly in ADO itself - I'm not sure).

But you mention Oracle, and this back-end definitely does support bulk insert operations via its native APIs.

There is a commercial Delphi component library - ODAC (Oracle Direct Access Components) for, um, direct access to Oracle (it does not even require the Oracle client software to be installed).

This also directly supports the bulk insert capabilities provided by Oracle and is additionally a highly efficient means for accessing your Oracle data stores.


What you are trying to do is called a bulk insert. Oracle provides the .NET assembly Oracle.DataAccess.dll (ODP.NET) for exactly this purpose. No hand-rolled solution you can think of is likely to beat the performance of this vendor library for the Oracle DBMS.

http://download.oracle.com/docs/html/E10927_01/OracleBulkCopyClass.htm#CHDGJBBJ

http://dotnetslackers.com/articles/ado_net/BulkOperationsUsingOracleDataProviderForNETODPNET.aspx

The most common idea is to use arrays of values for each column and apply them to a template SQL. In the example below employeeIds, firstNames, lastNames and dobs are arrays of the same length with the values to insert.

The Array Binding feature in ODP.NET allows you to insert multiple records in one database call. To use Array Binding, you simply set OracleCommand.ArrayBindCount to the number of records to be inserted, and pass arrays of values as parameters instead of single values:

    string sql = "insert into bulk_test (employee_id, first_name, last_name, dob) "
               + "values (:employee_id, :first_name, :last_name, :dob)";

    OracleConnection cnn = new OracleConnection(connectString);
    cnn.Open();
    OracleCommand cmd = cnn.CreateCommand();
    cmd.CommandText = sql;
    cmd.CommandType = CommandType.Text;
    cmd.BindByName = true;

    // To use ArrayBinding, we need to set ArrayBindCount
    cmd.ArrayBindCount = numRecords;

    // Instead of single values, we pass arrays of values as parameters
    cmd.Parameters.Add(":employee_id", OracleDbType.Int32, employeeIds, ParameterDirection.Input);
    cmd.Parameters.Add(":first_name", OracleDbType.Varchar2, firstNames, ParameterDirection.Input);
    cmd.Parameters.Add(":last_name", OracleDbType.Varchar2, lastNames, ParameterDirection.Input);
    cmd.Parameters.Add(":dob", OracleDbType.Date, dobs, ParameterDirection.Input);
    cmd.ExecuteNonQuery();
    cnn.Close();

As you can see, the code does not look that much different from doing a regular single-record insert. However, the performance improvement is quite drastic, depending on the number of records involved. The more records you have to insert, the bigger the performance gain. On my development PC, inserting 1,000 records using Array Binding is 90 times faster than inserting the records one at a time. Yes, you read that right: 90 times faster! Your results will vary, depending on the record size and network speed/bandwidth to the database server.

A bit of investigative work reveals that the SQL is considered to be "executed" multiple times on the server side. The evidence comes from V$SQL (look at the EXECUTIONS column). However, from the .NET point of view, everything was done in one call.
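
For instance, a query along these lines against V$SQL (using the bulk_test table from the example above) should show the EXECUTIONS count climbing even though the client made a single call:

    SELECT sql_text, executions
      FROM v$sql
     WHERE sql_text LIKE 'insert into bulk_test%';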


You can really improve the insert performance by using the TADOConnection object directly.

dbConn := TADOConnection......
dbConn.BeginTrans;
try
  dbConn.Execute(cmdText, [eoExecuteNoRecords]); // cmdText holds the INSERT statement
  dbConn.CommitTrans;
except
  on E:Exception do
  begin
    dbConn.RollbackTrans;
    raise;
  end;
end;

Also, the speed can be improved further by sending more than one record per Execute call.
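
For example, here is a rough sketch that batches a number of inserts into a single round trip by wrapping them in an anonymous PL/SQL block (the table and column names are invented, and how much this helps depends on your OLE DB provider):

    // assumes: uses SysUtils, ADODB;
    procedure InsertBatch(dbConn: TADOConnection);
    var
      sqlText: WideString;
      i: Integer;
    begin
      dbConn.BeginTrans;
      try
        sqlText := 'BEGIN ';
        for i := 1 to 100 do
          sqlText := sqlText + Format(
            'INSERT INTO mytable (myfield, myfieldN) VALUES (%d, %d); ', [i, i * 2]);
        sqlText := sqlText + 'END;';
        // one Execute call carries all 100 inserts to the server
        dbConn.Execute(sqlText, [eoExecuteNoRecords]);
        dbConn.CommitTrans;
      except
        dbConn.RollbackTrans;
        raise;
      end;
    end;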


You could also try the BatchOptimistic mode of TADODataSet. I don't have Oracle, so I have no idea whether it is supported there, but I have used something similar with MS SQL Server.

ADODataSet1.CommandText:='select * from  .....';
ADODataSet1.LockType:=ltBatchOptimistic;
ADODataSet1.Open;

ADODataSet1.Insert;
ADODataSet1.FieldByName('myfield').Value:=myvalue1;
//..
ADODataSet1.FieldByName('myfieldN').value:=myvalueN1;
ADODataSet1.Post;

ADODataSet1.Insert;
ADODataSet1.FieldByName('myfield').Value:=myvalue2;
//..
ADODataSet1.FieldByName('myfieldN').value:=myvalueN2;
ADODataSet1.Post;

ADODataSet1.Insert;
ADODataSet1.FieldByName('myfield').Value:=myvalue3;
//..
ADODataSet1.FieldByName('myfieldN').value:=myvalueN3;
ADODataSet1.Post;


// Finally update Oracle with entire dataset in one batch
ADODataSet1.UpdateBatch(arAll);


1000 rows is probably not the point at which this approach becomes economical, but consider writing the inserts to a flat file and then running the SQL*Loader command-line utility. That is seriously the fastest way to bulk-load data into Oracle.

http://www.oracleutilities.com/OSUtil/sqlldr.html

I've seen developers spend literally weeks writing (Delphi) loading routines that performed several orders of magnitude slower than SQL*Loader controlled by a control file that took around an hour to write.
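
As a rough illustration, the flat-file route boils down to a CSV plus a small control file along these lines (file, table and column names here are just placeholders):

    -- bulk_test.ctl (hypothetical)
    LOAD DATA
    INFILE 'bulk_test.csv'
    APPEND
    INTO TABLE bulk_test
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (employee_id, first_name, last_name, dob DATE "YYYY-MM-DD")

You would then run something like sqlldr userid=scott/tiger@mydb control=bulk_test.ctl direct=true (credentials and TNS alias are placeholders); direct=true requests a direct-path load, which is where most of the speed comes from.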


Remember to disable any controls that are linked to the DataSet/Table/Query/...

...
ADOTable.DisableControls;
try
   ...
finally
   ADOTable.EnableControls;
end;
...


You might try Append instead of Insert:

AdoTable1.Append;
AdoTable1.FieldByName('myfield').Value:=myvalue;
//..
//..
//..
AdoTable1.FieldByName('myfieldN').value:=myvalueN;
AdoTable1.Post;

With Append, you will save some effort on the client dataset, as the records will get added to the end rather than inserting records and pushing the remainder down.

Also, you can unhook any data aware controls that might be bound to the dataset or lock them using BeginUpdate.

I get pretty decent performance out of the append method, but if you're expecting bulk speeds, you might want to look at inserting multiple rows in a single query by executing the query itself like this:

// Oracle does not accept a multi-row VALUES list; use a multi-table INSERT ALL instead
AdoQuery1.SQL.Text := 'INSERT ALL INTO myTable (myField1, myField2) VALUES (1, 2) ' +
  'INTO myTable (myField1, myField2) VALUES (3, 4) SELECT * FROM dual';
AdoQuery1.ExecSQL;

You should get some benefits from the database engine when inserting multiple records at once.
