What is the most efficient way to move data from one SQL table to another
Currently I have an SSIS job that runs to move data from one SQL Server 2008 machine to another. The job moves about 2 million records across about 6 tables. That takes about 5-10 minutes depending on the server load, and that is fine, because the data is moved into temp tables, so nothing is affected besides the extra strain on the server.
My problem comes when I then want to merge that data into the respective live tables. That can take about 15 minutes, during which the tables are emptied and then repopulated. What I am wondering is: what is the most efficient way to move the data between the tables?
Currently here is how it goes:
1. Drop the tables.
2. Rebuild the tables with indexes and constraints.
3. Use INSERT INTO ... SELECT to move the data.
4. Run any calculations that are needed.
5. Rebuild all the indexes after the data has been moved, using: sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"
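For reference, the current process looks roughly like this (table, column, and index names below are hypothetical placeholders, not my actual schema):

```sql
-- 1. Drop and rebuild the live table with its indexes and constraints.
DROP TABLE dbo.LiveTable;

CREATE TABLE dbo.LiveTable (
    Id INT NOT NULL PRIMARY KEY,
    Amount DECIMAL(18, 2) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_LiveTable_Amount ON dbo.LiveTable (Amount);

-- 2. Move the staged data across.
INSERT INTO dbo.LiveTable (Id, Amount)
SELECT Id, Amount
FROM dbo.TempTable;

-- 3. Rebuild every index in the database.
EXEC sp_MSforeachtable @command1 = "print '?' DBCC DBREINDEX ('?')";
```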
I feel like there should be a better way, one with minimal downtime for the users. One thought I had was to create a second set of tables and then just rename them once they are ready, but I am not sure if that is the best way either.
I have also just read about the MERGE command, which might be better since I would not have to drop and repopulate the tables, meaning all the data would stay available. But it is hard to know whether a record has changed without comparing nearly all the columns.
I would appreciate any help.
If you are emptying and repopulating, what we often do is create a view with the same name as the current table (so no existing code breaks) and create two tables, TableNameA and TableNameB, with the same structure and all the data. Point the view to TableNameA. Truncate TableNameB, drop its indexes, run the process to fill TableNameB, and reindex it. Then run the script to point the view to TableNameB. Downtime for the user? Milliseconds. The next time, you switch: truncate and fill TableNameA, then repoint the view to TableNameA.
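A minimal sketch of that A/B swap, assuming two identically structured tables dbo.OrdersA and dbo.OrdersB and a staging source dbo.StagingOrders (all names here are hypothetical):

```sql
-- Reporting code reads from the view, never from the tables directly.
CREATE VIEW dbo.Orders AS SELECT * FROM dbo.OrdersA;
GO

-- Refresh cycle: fill the inactive table while the view serves the other.
TRUNCATE TABLE dbo.OrdersB;

INSERT INTO dbo.OrdersB (Id, Amount)
SELECT Id, Amount FROM dbo.StagingOrders;

-- Repointing the view is a metadata change, so downtime is negligible.
ALTER VIEW dbo.Orders AS SELECT * FROM dbo.OrdersB;
GO
```

On the next refresh you reverse the roles: truncate and fill dbo.OrdersA, then alter the view back to it.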
Have a look at table partitioning. I believe your use case is one of the reasons table partitioning exists.
Here's a summary
This is more on point with your question
I should mention that this feature is only available on Enterprise and Developer SKUs
In addition to the above point about table partitioning, you can then skip the temp-table step. Load using a SQL Server Destination, running the package on the destination server, into an empty partition. Use partitioned indexes and rebuild the index only for that partition. Then switch the new partition in.
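A sketch of the switch-in step, assuming a partitioned table dbo.Sales and a staging table dbo.SalesStage that matches its schema, indexes, and constraints and sits on the filegroup of the target partition (all names are hypothetical, and this requires Enterprise or Developer edition):

```sql
-- 1. Bulk load dbo.SalesStage (e.g. via the SSIS SQL Server Destination).

-- 2. Rebuild only the staging table's indexes, not the whole database.
ALTER INDEX ALL ON dbo.SalesStage REBUILD;

-- 3. Switch the staging table into the empty partition. This is a
--    metadata-only operation, so the live table is barely interrupted.
ALTER TABLE dbo.SalesStage
SWITCH TO dbo.Sales PARTITION 4;
```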
After considering the option of a view over different underlying tables, I decided against it to avoid the complexity and confusion it might introduce. I looked into partitioning, but since I do not have much control over the source machine, it didn't seem like the appropriate solution. So in the end I decided to use the SQL MERGE statement with BINARY_CHECKSUM to compare rows and detect differences. Though I am now having some issues with locking with that approach. I have opened another question for that:
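The MERGE I ended up with looks roughly like this (table and column names below are illustrative placeholders):

```sql
MERGE dbo.LiveTable AS target
USING dbo.TempTable AS source
    ON target.Id = source.Id
WHEN MATCHED
    -- Compare whole rows cheaply; note that BINARY_CHECKSUM can collide,
    -- so in rare cases a changed row may be reported as unchanged.
    AND BINARY_CHECKSUM(target.Name, target.Amount)
     <> BINARY_CHECKSUM(source.Name, source.Amount)
    THEN UPDATE SET
        target.Name   = source.Name,
        target.Amount = source.Amount
WHEN NOT MATCHED BY TARGET
    THEN INSERT (Id, Name, Amount)
         VALUES (source.Id, source.Name, source.Amount)
WHEN NOT MATCHED BY SOURCE
    THEN DELETE;
```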
How to improve performance of SQL MERGE statement