How does SQL Server handle SELECT statements in a transactional INSERT?

As part of a retail closing process, there is a transactional stored procedure that selects from each of 18 tables and inserts them into a separate database for later mainframe processing. This procedure is showing some strange timing behavior, and I think it is because of a fundamental misunderstanding of the way transactions work in SQL Server.

I recognize that this isn't the best architecture for this problem, and the new solution is being developed, but in the meantime, I need to improve this process.

The stored procedure is running based on user request, and looks something like this:

BEGIN TRANSACTION

INSERT INTO Table1
        (Column1,
        Column2,
        Column3,
        Column4,
        Column5,
        Column6,
        Column7,
        Column8)
    SELECT
        Column1,
        Column2,
        Column3,
        Column4,
        Column5,
        Column6,
        Column7,
        Column8
    FROM
        OLTPTable T
    INNER JOIN
        LookupTable1 L
    ON  
        T.Foreign = L.Key

INSERT INTO Table2
        (Column1,
        Column2,
        Column3)
    SELECT
        Column1,
        Column2,
        Column3
    FROM
        OLTPTable2 T
    INNER JOIN
        LookupTable2 L
    ON  
        T.Foreign = L.Key

INSERT INTO Table3
        (Column1,
        Column2,
        Column3,
        Column4,
        Column5,
        Column6)
    SELECT
        Column1,
        Column2,
        Column3,
        Column4,
        Column5,
        Column6
    FROM
        OLTPTable3 T
    INNER JOIN
        LookupTable3 L
    ON  
        T.Foreign = L.Key

-- Through Table 18 and OLTP Table 18

COMMIT TRANSACTION

The logging looks something like this:

Table1      0.2 seconds 354 rows
Table2      7.4 seconds 35 rows
Table3      3.9 seconds 99 rows

There isn't a clear correlation between row count or join complexity and elapsed time.

My question is: on a long procedure like this, what is the effect of the transaction? Does it lock all the tables in the subselects at the beginning? One at a time? Is it waiting for the source table to be available for a lock, which is causing the waits?


Under the default READ COMMITTED isolation level, read (shared) locks are held only for the duration of each SELECT, not for the whole transaction.

To change this, you'd need REPEATABLE READ or higher to persist the shared (read) locks until the end of the transaction.
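
For example, a minimal sketch reusing an abridged version of the first insert from the question; only the isolation-level line is new:

-- Shared locks taken by the SELECTs below are now held until COMMIT
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION

INSERT INTO Table1 (Column1, Column2)
    SELECT Column1, Column2
    FROM OLTPTable T
    INNER JOIN LookupTable1 L ON T.Foreign = L.Key
-- Concurrent writers to OLTPTable and LookupTable1 now block until COMMIT

-- ... remaining inserts ...

COMMIT TRANSACTION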

Notes:

  • the lock granularity (row, page, etc.) is separate from this duration
  • other processes will be able to read the SELECTed tables

Your INSERT durations will be affected by a whole raft of conditions. Some of them (a quick catalog check for the first few follows this list):

  • indexes for SELECT
  • indexes to be maintained as part of the INSERT
  • triggers on the target table
  • other writing and reading processes
  • transaction log file writes
  • ...and all the stuff I mentioned here: Why does an UPDATE take much longer than a SELECT?
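
As a starting point, you can inventory what each INSERT has to maintain on a given target table. A sketch against the standard catalog views; substitute your own table name for dbo.Table1:

-- Indexes that must be maintained on every INSERT into the target table
SELECT i.name, i.type_desc, i.is_unique
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('dbo.Table1')
  AND i.type > 0;   -- skip the heap entry

-- Triggers that fire on the target table
SELECT t.name, t.is_disabled
FROM sys.triggers AS t
WHERE t.parent_id = OBJECT_ID('dbo.Table1');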

Edit:

After more thought, you might be better off with sp_getapplock etc. to maintain app-level concurrency; see the sketch after these links:

  • How to Prevent Sql Server Jobs to Run simultaneously
  • TSQL mutual exclusive access in a stored procedure
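
A minimal sketch of the sp_getapplock pattern; the resource name is arbitrary and error handling is trimmed:

BEGIN TRANSACTION

DECLARE @result int;

-- Serialize the closing process across sessions
EXEC @result = sp_getapplock
    @Resource    = 'RetailClosingProcess',   -- arbitrary app-level name
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',            -- released automatically at COMMIT/ROLLBACK
    @LockTimeout = 60000;                    -- wait up to 60 seconds

IF @result < 0
BEGIN
    ROLLBACK TRANSACTION;
    RAISERROR('Could not acquire app lock', 16, 1);
    RETURN;
END

-- ... the 18 INSERT ... SELECT statements ...

COMMIT TRANSACTION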


Under what isolation level?

For the write part (the insert) everything is the same no matter the isolation level: all inserted rows are locked in X mode until the end of the transaction. Row locks may escalate to table locks.
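
You can watch those X locks, and any escalation, while the procedure runs. A sketch against sys.dm_tran_locks from a second connection; the session_id is illustrative:

-- Locks held by the procedure's session (replace 53 with its session_id)
SELECT resource_type,       -- RID/KEY vs OBJECT shows whether escalation occurred
       request_mode,        -- expect X on inserted rows
       request_status,
       COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 53
GROUP BY resource_type, request_mode, request_status;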

If the isolation level is left at the default READ COMMITTED and the READ_COMMITTED_SNAPSHOT option on the database is OFF, things happen like this: each SELECT locks one row at a time in S mode and releases it immediately.
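
You can confirm which of the two read-committed behaviors your database uses:

-- 1 = readers use row versions; 0 = readers take S locks
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = DB_NAME();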

Under repeatable read isolation the S locks acquired by the SELECTs will be retained until the end of the transaction, and they may escalate to table S locks.

Under serializable isolation the SELECTs will acquire range locks instead of row locks and will retain them until the end of the transaction. Again, lock escalation may occur.

Under snapshot isolation the SELECTs acquire no locks at all (ignoring some technicalities around schema stability locks); they will read any locked row from the version store instead. The version read corresponds to the snapshot of the value at the beginning of the transaction.

Under read committed isolation, when the READ_COMMITTED_SNAPSHOT option is enabled on the database, the SELECTs will not acquire any locks; they will read from the version store instead if a row is locked. The version read corresponds to the snapshot of the value at the beginning of the statement.
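
Both versioning behaviors are database-level settings. A sketch; the database name is a placeholder, both add tempdb version-store overhead, and enabling READ_COMMITTED_SNAPSHOT needs exclusive access to the database unless forced:

-- Allow SET TRANSACTION ISOLATION LEVEL SNAPSHOT in sessions
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Make the default READ COMMITTED level use row versioning
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;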

Now back to your question: why do you see differences in performance? As with any performance question, it is best to apply an investigation methodology, and Waits and Queues is an excellent one. Once you investigate the root cause, a proper solution can be proposed.
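
For example, sampling what the procedure's session is actually waiting on while it runs is a reasonable first probe; the session_id below is illustrative:

-- What is the slow session waiting on right now?
SELECT session_id,
       status,
       wait_type,            -- e.g. LCK_M_S, PAGEIOLATCH_SH, WRITELOG
       wait_time,            -- ms spent in the current wait
       blocking_session_id
FROM sys.dm_exec_requests
WHERE session_id = 53;       -- replace with the procedure's session_id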


In my opinion, using or not using transactions does not affect performance much in this case. I suggest you tune each insert individually (take into account that the performance of your inserts depends not only on the sub-selects, but also on the inserts themselves: the more indexes on the target tables, the worse the insert performance; you might also have chosen the wrong clustered indexes for some tables).
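
Tuning each insert individually first needs reliable per-statement numbers. A sketch of the timing pattern that would produce the log in the question, mirroring its first insert; the log table dbo.ClosingProcessLog is made up:

DECLARE @t0 datetime2 = SYSDATETIME();
DECLARE @rows int;

INSERT INTO Table1 (Column1, Column2)
    SELECT Column1, Column2
    FROM OLTPTable T
    INNER JOIN LookupTable1 L ON T.Foreign = L.Key;

SET @rows = @@ROWCOUNT;   -- capture before any other statement resets it

-- Hypothetical log table: one row per statement with duration and row count
INSERT INTO dbo.ClosingProcessLog (TableName, DurationMs, RowsCopied)
VALUES ('Table1', DATEDIFF(millisecond, @t0, SYSDATETIME()), @rows);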
