faster way for moving large amounts data in oracle to external database

2023-03-28 20:20 问答作者：

I have a java program that compares and moves data from a local Oracle database to an external MS SQL data base hosted by an online vendor.

Currently the program connects to the MS SQL database doing a SELECT * FROM myTABLE and putting the results in a two-dimensional array. A temp table is then created in Oracle and the data is inserted into the temp table using a for loop:

for (int ii = 0; ii < arr.length(); ii++){
    query = "insert into myTable_temp values "+values_from_array;
    stmt.executeQuery();
}

A MINUS statement is used find the differences:

query = "select * from ora_table minus select * from myTable_temp";
rs = stmt.executeQuery();

and the result is then inserted back into the MS SQL table using a insert_into_mssql() method:

while (rs.next()){
    query = "insert into myTable values "+ values_from_rs
    insert_into_mssql(query);
}

This works fine except a table I am us开发者_高级运维ing has 220,000 rows. So it takes a long time to load the array, insert into temp table, run the minus statement, then insert everything back into the mssql table.

Is there a more efficient way to do this? or a way to make this faster? (a dblink or any direct connection wont work bc of security restrictions)

Yes, What version of MS SQL Server are you using? Depending on that, you can look into SSIS or other DTS software. It allows you to do excactly what you want very quickly.

I think it is a good idea to create a dblink.

Maybe it may help you.

The new addBatch( ) method of Statement allows you to lump multiple update statements as a unit and execute them at once. Call addBatch( ) after you create the statement, and before execution:

con.setAutoCommit(false); // If some fail, we want to rollback the rest
Statement stmt = con.createStatement(  );

stmt.addBatch(
"INSERT INTO CUSTOMERS VALUES (1, "J Smith", "617 555-1323");
stmt.addBatch(
"INSERT INTO CUSTOMERS VALUES (2, "A Smith", "617 555-1132");
stmt.addBatch(
"INSERT INTO CUSTOMERS VALUES (3, "C Smith", "617 555-1238");
stmt.addBatch(
"INSERT INTO CUSTOMERS VALUES (4, "K Smith", "617 555-7823");

int[] upCounts = stmt.executeBatch(  );
con.commit(  );

Notice that we turn transaction auto-commit off before creating the batch.

I would maybe start by not loading it into an array and just using a similar technique as is used in your writing out. Could also look at parallelizing the inserts.

If I was doing this from scratch, I would bring all the Oracle data to SQL Server and do everything there (copy table from Oracle, find differences and insert into SQL Server, drop the table copied from Oracle) instead of bringing all the SQL Server data to Oracle, dumping just the new records and bringing them over to SQL Server and inserting them.

SQL Server also has a pretty good ETL tool - SQL Server Integration Services, but I haven't had a lot of success using it with things besides SQL Server data sources and destinations.

Taking into account technological limitations (no dblinks and so on) you can try to lower transport overhead. This can be done by maintaining synchronization state and transferring only updated rows.

If I understand correctly, you need only transfer updated and new rows from Oracle database to MS SQL. In very generic variant you need to add 2 columns into oracle_table which holds replication state:

last_update_time - maintained by trigger on insert or update for each row. Trigger set column value to current date and time on each row change and for new row.

last_sync_time - maintained by trigger on insert and changed by sync application.

For new row last_update_time := sysdate and last_sync_time := last_update_time - 1.

On each oracle_table row update last_update_time updated by current date time.

On each synchronization session next sequence performed:

Select all rows from oracle_table where last_sync_time < last_update_time and insert it to some buffer table on Oracle server:

begin
  execute immediate 'truncate table buffer_table';

  insert into buffer_table(...)
  select * from oracle_table where last_sync_time < last_update_time;

end;

Transfer all rows from buffer_table to MS SQL server and update myTable accordingly.

Advance update time for all synchronized rows to new value:

update 
  oracle_table o_table
set 
  last_sync_time = ( select buf_table.last_update_time 
                     from buffer_table buf_table
                     where buf_table.Primary_Key = o_table.Primary_Key
                   )
where
  o_table.Primary_Key in (select Primary_Key from buffer_table)

Note that synchronized rows may be changed during synchronization process and we need timestamp value from buffer_table while updating last_sync_time.

P.S. I supposed that database on MS SQL side is read-only. Two-way synchronization can be done on the same principle with slightly algorithm improvements.

继续阅读：oracle sql-server

faster way for moving large amounts data in oracle to external database

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？