Best practice for archiving a huge table of over 1,000,000,000 rows

I'm using SQL Server 2005. There is an audit trail table containing over 1,000,000,000 rows, and I'm planning to archive it. Even when I run a simple SELECT with NOLOCK, I can still see blocking (probably I/O contention with other processes?). Are there any best practices for this kind of situation?


For a table that large you will want to find an effective sharding/partitioning strategy. Archiving in this sense tends to be a form of partitioning, but not a good one, since you often want to query over the current and archived data together anyway. In the worst case you end up with a SELECT over a UNION of the archive and current tables, which is worse than if you hadn't split them at all.

You will often do better by finding some other means to slice the data, say by record type. But if you are going to split it by date, make absolutely sure you won't need to query over the archive and current data as one set.
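
If date-based partitioning does fit your query patterns, a minimal sketch of native table partitioning in SQL Server 2005 (Enterprise Edition) might look like the following; the table, column, and object names here are made-up placeholders, not anything from the question:

    -- Hypothetical partition function/scheme splitting an audit table by year.
    CREATE PARTITION FUNCTION pfAuditByYear (datetime)
    AS RANGE RIGHT FOR VALUES ('2007-01-01', '2008-01-01', '2009-01-01');

    CREATE PARTITION SCHEME psAuditByYear
    AS PARTITION pfAuditByYear ALL TO ([PRIMARY]);

    -- The table (or its clustered index) is created on the scheme,
    -- so old partitions can be switched out cheaply instead of deleted row by row.
    CREATE TABLE dbo.AuditTrail
    (
        AuditId   bigint       NOT NULL,
        AuditDate datetime     NOT NULL,
        Detail    varchar(max) NULL,
        CONSTRAINT PK_AuditTrail PRIMARY KEY CLUSTERED (AuditDate, AuditId)
    ) ON psAuditByYear (AuditDate);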

Also, SQL Server 2005+ does not enable MVCC by default. It can, however, if you turn on what MS calls Snapshot Isolation. See Serializable vs. Snapshot Isolation Level.

The effect of not having this enabled is that an uncommitted INSERT or UPDATE will block a SELECT in another transaction until the first transaction commits or rolls back. That can cause unnecessary locks and limit your scalability.
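
For reference, row versioning is switched on per database and then requested per session; a minimal sketch, assuming a database called AuditDb (the database and table names are placeholders):

    -- Allow snapshot isolation in the database.
    ALTER DATABASE AuditDb SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- Optionally make plain READ COMMITTED reads use row versioning as well.
    ALTER DATABASE AuditDb SET READ_COMMITTED_SNAPSHOT ON;

    -- A session that should read without blocking on writers:
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    SELECT COUNT(*) FROM dbo.AuditTrail;  -- reads a consistent snapshot, takes no shared locks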


Create a backup of the database and restore it to the archive location.
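
A bare-bones sketch of that approach; the database name, logical file names, and paths are placeholders you would replace with your own:

    -- Back up the source database.
    BACKUP DATABASE AuditDb
        TO DISK = N'D:\Backups\AuditDb.bak'
        WITH INIT;

    -- Restore it under a different name as the archive copy.
    RESTORE DATABASE AuditDb_Archive
        FROM DISK = N'D:\Backups\AuditDb.bak'
        WITH MOVE 'AuditDb'     TO N'E:\Archive\AuditDb_Archive.mdf',
             MOVE 'AuditDb_log' TO N'E:\Archive\AuditDb_Archive_log.ldf';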


Selecting 1 billion rows all at once is going to strain the server no matter how you do it.

Do it in batches instead, say 1,000 rows at a time. The bcp tool does this automatically, or you can use SSIS to copy the data into another database, which does pretty much the same thing.
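
A sketch of that kind of batched move in T-SQL, assuming an archive table with the same columns already exists; the names (AuditTrail, AuditTrail_Archive, AuditDate) are placeholders for this example:

    -- Move rows in small batches so each transaction stays short and locks are released quickly.
    DECLARE @BatchSize int;
    SET @BatchSize = 1000;

    WHILE 1 = 1
    BEGIN
        BEGIN TRANSACTION;

        -- Delete a batch from the source and copy the deleted rows into the archive.
        DELETE TOP (@BatchSize) src
        OUTPUT deleted.* INTO dbo.AuditTrail_Archive
        FROM dbo.AuditTrail AS src
        WHERE src.AuditDate < '20090101';   -- archive cut-off, adjust as needed

        IF @@ROWCOUNT = 0
        BEGIN
            COMMIT TRANSACTION;
            BREAK;
        END

        COMMIT TRANSACTION;
    END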
