Guidelines for using Merge task in SSIS
I have a table with three fields, one an identity field, and I need to add some new records from a source that has the other two fields. I'm using SSIS, and I think I should use the merge tool, because one of the sources is not in the local database. But, I'm confused by the merge tool and the proper process.
I have my one source (an Oracle table), and I get two fields, well_id and well_name, with a sort after, sorting by well_id. I have the destination table (sql server), and I'm also using that as a source. It has three fields: well_key (identity field), well_id, and well_name, and I then have a sort task, sorting on well_id. Both of those are input to my merge task. I was going to output to a temporary table, and then somehow get the new records back into the sql server table.
Oracle Well SQL Well
| |
V V
Sort Source Sort Well
| |
-------> Merge* <-----------
|
V
Temp well table
I suspect this isn't the best way to use this tool, however. What are the proper steps for a merge like th开发者_开发技巧is?
One of my reasons for questioning this method is that my merge has an error, telling me that the "Merge Input 2" must be sorted, but its source is a sort task, so it IS sorted.
Example data
SQL Well (before merge)
well_key well_id well_name
1 123 well k
2 292 well c
3 344 well t
5 439 well d
Oracle Well
well_id well_name
123 well k
292 well c
311 well y
344 well t
439 well d
532 well j
SQL Well (after merge)
well_key well_id well_name
1 123 well k
2 292 well c
3 344 well t
5 439 well d
6 311 well y
7 532 well j
Would it be better to load my Oracle Well to a temporary local file, and then just use a sql insert statment on it?
First of all, I would highly recommend getting rid of the sort transformation before merging. The sort is very expensive blocking asynchronous transformation and can be easily avoided by doing an order by and changing your ole db src advance properties of your ouput columns to issorted = true and changing the sortkeyposition property of that column to 1. Dont forget to add your Order by in your tsql since the ole db source does not automatically do this for you:
Synchronous vs Asynchronous
Oledb Src IsSorted Property
Then, I would test the performance of the pkg using the Merge. Looks as though the fields you are pulling are not that big in size; however, if there is 10's of millions of records, then I would also recommend creating a pkg that lands the data into two staging tables, joining them, and then inserting the data as you have suggested trying above to see if you get better performance using that approach.
Hope this helps.
精彩评论