开发者

Corrupted database table with invalid query: what has it done and will it roll back?

I made the error of running what should have been a quick update against my fact table (200M rows) with this:

update dbo.primary_fact
   set count_of_loan_obligors = o.n              
  from dbo.staging_fact f  
       -- notice that this is not the same table as the one in the update clause
       inner join ##Obligor_Count o
       on (f.time_dimension_id = o.time_dimension_id
           and f.account_dimension_id = o.account_dimension_id)

It should have been:

  from dbo.primary_fact f

A correctly formed update like this (1 day, 87k accounts) typically finishes in a minute or 2. After running for 12 minutes, I wondered what was taking so long and spotted my error.

I cancelled the query in SQL Server Management Studio which I understand will roll back all the awful I caused (can someone confirm?)

开发者_JAVA百科But my bigger question is: What does the incorrectly formed query do?


Update: The cancel action finally completed, an hour and 39 minutes later. DBAs were too slow on the kill -- just as well.

Properly formed update finished in 8 seconds.

Second Update: There were no values set from the original (faulty) update following the successful cancel order in SSMS. I would interpret this to mean that any pending updates were rolled back.


But my bigger question is: What does the incorrectly formed query do?

It will update dbo.primary_fact.count_of_loan_obligors with the same value for all rows. The value will be some value from ##Obligor_Count.n. It is hard to figure out what value that would be.

Here is a little test that basically does what you did:

declare @T1 table (ID int)
declare @T2 table (ID int)

insert into @T1 values (0)
insert into @T1 values (0)

insert into @T2 values (2)
insert into @T2 values (1)

update @T1
set ID = T2.ID
from @T2 as T2

select *
from @T1

Result:

ID
2
2

In this case @T1 is updated with the first row in @T2.


Well, first of all it looks like you could probably just re-run your query with the correct from clause instead of the staging_fact table, and it will overwrite any booboos you made. That's the good news, and the joy of working with fact tables.

Bad news is that from my experience, SSMS doesn't roll anything back unless you actually run it within a transaction, so your data is probably a big bucket of fail right now.

Hope you enjoyed that last 12 minutes of your life, because you're about to have the pleasure of doing it again.


I worry that neither version does anything good, because they are both missing a WHERE clause connecting the table being updated with the source of the new values (f inner join o). I expect the line WHERE primary_fact.time_dimension_id=f.time_dimension_id AND primary_fact.account_dimension_id=f.account_dimension_id got dropped in the copy/paste.

As long as the referenced tables have the columns of those names, the join of f and o will be performed just fine. Then those values will be used to update primary_fact, either with the WHERE clause, or in some way I don't know. The UPDATE/FROM syntax is not standard SQL, but it is widely supported. Maybe SQL SERVER even adds in a default WHERE clause. Postgresql doesn't.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜