Making structural changes to very large tables in an online environment

So here's what I'm facing.

The Problem

  • A large table with ~230,000,000 rows.
  • We want to change the clustered index and primary key of this table to a simple bigint identity field. One other, currently empty, field is also being added to the table for future use.
  • The existing table has a composite key. For the sake of argument, let's say it's two bigints; the first part may have anywhere from 1 to 10,000 'children' in the second part of the key.

Requirements

  • Minimal downtime, ideally just the length of time it takes to run sp_rename.
  • Existing rows may change while we're copying data. The updates must be reflected in the new table.

Ideas

  1. Put a trigger on the existing table to update the corresponding row in the new table if it already exists there.
  2. Iterate through the original table, copying data into the new table ~10,000 rows at a time, perhaps batching on ~2,000 values of the first part of the old key.
  3. When the copy is complete, rename the old table to "ExistingTableOld" and the new one from "NewTable" to "ExistingTable". This should allow stored procs to continue to run without intervention. (A rough sketch of these three steps follows the list.)
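
A minimal T-SQL sketch of the three steps, assuming the table names above and made-up column names (KeyPart1/KeyPart2 for the composite key, SomeValue for the payload, and a bigint identity Id already defined on dbo.NewTable); it is meant as a shape for the plan, not a drop-in script:

    -- Step 1: trigger that pushes updates from the old table into the new one
    -- for rows that have already been copied.
    CREATE TRIGGER trg_ExistingTable_SyncUpdates
    ON dbo.ExistingTable
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        UPDATE n
        SET    n.SomeValue = i.SomeValue            -- repeat for each copied column
        FROM   dbo.NewTable AS n
        JOIN   inserted     AS i
               ON  n.KeyPart1 = i.KeyPart1
               AND n.KeyPart2 = i.KeyPart2;         -- rows not yet copied are skipped
    END;
    GO

    -- Step 2: copy in batches, walking the first part of the old key
    -- roughly 2,000 key values at a time.
    DECLARE @fromKey bigint, @toKey bigint;
    SELECT @fromKey = MIN(KeyPart1) - 1 FROM dbo.ExistingTable;

    WHILE 1 = 1
    BEGIN
        SELECT @toKey = MAX(KeyPart1)
        FROM  (SELECT TOP (2000) KeyPart1
               FROM   dbo.ExistingTable
               WHERE  KeyPart1 > @fromKey
               GROUP  BY KeyPart1
               ORDER  BY KeyPart1) AS batch;

        IF @toKey IS NULL BREAK;                    -- nothing left to copy

        INSERT INTO dbo.NewTable (KeyPart1, KeyPart2, SomeValue)
        SELECT KeyPart1, KeyPart2, SomeValue
        FROM   dbo.ExistingTable
        WHERE  KeyPart1 > @fromKey AND KeyPart1 <= @toKey;

        SET @fromKey = @toKey;
    END;
    GO

    -- Step 3: the swap itself, the only step that needs a (brief) outage window.
    BEGIN TRANSACTION;
        EXEC sp_rename 'dbo.ExistingTable', 'ExistingTableOld';
        EXEC sp_rename 'dbo.NewTable', 'ExistingTable';
    COMMIT TRANSACTION;

One design note: the sync trigger stays attached to the table through the rename, so it would need to be dropped once the swap has been verified.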

Are there any glaring omissions in the plan, or best practices I'm ignoring?


Difficult problem. Your plan sounds good, but I'm not totally sure you really need to batch the copy, as long as you run it at the READ UNCOMMITTED transaction isolation level so the read doesn't take shared locks.
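
If you skip the batching, the copy might look like this minimal sketch (table names come from the question, column names are placeholders); note that the isolation level only affects the shared locks taken while reading the source, the insert into the new table still takes its own locks:

    -- Single-pass copy under READ UNCOMMITTED.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    INSERT INTO dbo.NewTable (KeyPart1, KeyPart2, SomeValue)
    SELECT KeyPart1, KeyPart2, SomeValue
    FROM   dbo.ExistingTable;

    -- The per-table equivalent of the isolation level is the NOLOCK hint:
    -- SELECT ... FROM dbo.ExistingTable WITH (NOLOCK);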


In my experience, big schema changes are best done during a maintenance window, at night or over a weekend, when users are booted off the system, just like running DBCC CHECKDB with a repair option. Then, if things go south, you have the option of restoring the full backup that you providentially made right before starting the upgrade.
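
For example, a copy-only full backup taken right before the change (database name and path are placeholders):

    -- COPY_ONLY keeps this backup out of the regular backup chain;
    -- CHECKSUM verifies page integrity while the backup is written.
    BACKUP DATABASE MyBigDatabase
    TO DISK = N'D:\Backups\MyBigDatabase_pre_change.bak'
    WITH COPY_ONLY, CHECKSUM, STATS = 10;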

Regarding item #3 on your list, renaming the old/new tables: you'll probably want to recompile the dependent stored procedures and views. In my experience, execution plans are bound against object IDs rather than object names.

Consider table dbo.foo: if it is renamed to dbo.foo_old, dependent stored procedures or user-defined functions won't necessarily error out, because their cached execution plans continue to work perfectly well; they stay bound to the original object until the dependent object is recompiled and its plan rebound.

sp_recompile is your friend.
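
For example (the procedure name below is a placeholder): passing the table name marks every module that references it for recompilation the next time it runs.

    -- Flag all procedures and triggers that reference the table
    -- for recompilation on their next execution.
    EXEC sp_recompile N'dbo.ExistingTable';

    -- Or target a single procedure.
    EXEC sp_recompile N'dbo.usp_SomeDependentProc';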
