How can I efficiently compare my data with a remote database?
I need to update my contacts database in SQL Server with changes made in a remote database (also SQL Server, on a different server on the same local network). I can't make any changes to the remote database, which is a commercial product. I'm connected to the remote database using a linked server. Both tables contain around 200K rows.
My logic at this point is very simple: [simplified pseudo-SQL follows]
/* Get IDs of new contacts into local temp table */
Select remote.ID into #NewContactIDs
From Remote.Contacts remote
Left Join Local.Contacts local on remote.ID=local.ID
Where local.ID is null
/* Get IDs of changed contacts */
Select remote.ID into #ChangedContactIDs
From Remote.Cont开发者_StackOverflow中文版acts remote
Join Local.Contacts local on remote.ID=local.ID
Where local.ModifyDate < remote.ModifyDate
/* Pull down all new or changed contacts */
Select ID, FirstName, LastName, Email, ...
Into #NewOrChangedContacts
From Remote.Contacts remote
Where remote.ID in (
Select ID from #NewContactIDs
union
Select ID from #ChangedContactIDs
)
Of course, doing those joins and comparisons over the wire is killing me. I'm sure there's a better way - advice?
Consider maintaining a lastCompareTimestamp (the last time you did the compare) in your local system. Grab all the remote records with ModifyDates > lastCmpareTimestamp and throw them in a local temp table. Work with them locally from there.
The last compare date is a great idea
One other method I have had great success with is SSIS (though it has a learning curve, and might be overkill unless you do this type of thing a lot):
Make a package
Set a data source for each of the two tables. If you expect a lot of change pull the whole tables, if you expect only incremental changes then filter by mod date. Make sure the results are ordered
Funnel both sets into a Full Outer Join
Split the results of the join into three buckets: unchanged, changed, new
Discard the unchanged records, send the new records to an insert destination, and send the changed records to either a staging table for a SQL-based update, or - for few rows - an OLEDB command with a parameterized update statement.
OR, if on SQL Server 2008, use Merge
精彩评论