开发者

How To Avoid Locking Database In Entity Framework 4 When Doing Many Updates

This question is in regards to the best practice for handling many inserts or updates using Microsoft Entity Framework. The problem is that we wrote a long-running program which pulls back thousands of records from the database, and then updates a single field on each of those records, one-by-one. Much to our dismay, we realized that each of these records that were updated were locked for the duration of the time in which the ObjectContext was not disposed. Below is some pseudocode (doesn't actually run) to illustrate:

using(ObjectContext context = new ObjectContext())
{

    var myRecords = context.CreateObjectSet<MyType>().AsQueryable();

    foreach(var record in myRecords)
    {
       record.MyField = "updated!";
       context.SaveChanges();

       //--do something really slow like call an external web service
   }
}

The problem is that we need to do many updates without any regard for transactions. We were surprised to realize that calling context.SaveChanges() actually creates the lock on the records and does not release it until the ObjectContext is disposed. We especially do NOT want to lock the records in the database as this is a high-traffic system and the program could potentially run for hours.

So the question is: what is the optimal way to do many updates in Microsoft Entity Fra开发者_运维百科mework 4 WITHOUT doing them all on one long transaction that locks the DB? We are hoping that the answer is not to create a new ObjectContext for every single update...


Entity framework on top of SQL server by default uses read committed transaction isolation level and transaction is committed at the end of SaveChanges. If you suspect other behavior it must be by the rest of your code (are you using TransactionScope? - you didn't show it in your code) or it must be some bug.

Also your approach is wrong. If you want to save each record separately you should also load each record separately. EF is definitely bad choice for this type of applications. Even if you use only single SaveChange for updating all your records it will still make single roundtrip to database for each update.


Those locks are not created by Entity Framework. EF only supports optimistic concurrency, pessimistic locking is not supported with EF.

I think the locking you experience is a result of your SQL Server configuration. Perhaps if your Transaction Isolation Level on the server is set to REPEATABLE READ this might cause the locks after each query. But I am not sure which configuration setting could be exactly the problem. More details are here.

Edit:

Another helpful article about transactions and transaction isolation in EF is here. It strongly recommends to always set the isolation level explicitely. Quote from the article:

If you don't take control of [the isolation level], you have no idea in which transaction isolation level your queries will be running. After all, you don't know where the connection that you got from the pool has been [...] You simply inherit the last used isolation level on the connection, so you have no idea which type of locks are taken (or worse: ignored) by your queries and for how long these locks will be held. On a busy database, this will definitely lead to random errors, time-outs and deadlocks.


I may be wrong, but I believe you should not be calling SaveChanges() every single time since that applies the changes to the database at that point. Instead, apply SaveChanges() at the end of your object changes, or use a counter to do it less frequently.


In our application we had a similar scenario, avoid locking as much as possible running a massive select and then creating a lot of inserts after some in memory operation.

  1. If you want to read everything upfront

Solution A) Use a transaction scope that includes read and update PRO: Data safely updated CONS: Locks caused by read (repeatable reads) and update

Solution B) Do not use a transaction and update all the data together PRO: Data safely updated, but the data you read may have been changed in the meanwhile CONS: Locks caused by the update for the entire duration (EF create by default a transaction)

Solution C) Update in batches instead of all the data all together (usable only if the select is not locking the tables, otherwise you get the same behaviour as B PRO: Shorter and smaller locks in the updated tables CONS: You increase the change of being affected by data obsolescence

  1. If you want (and can) read in batches

Solution D) Breaking down the problem and splitting the reads can facilitate you to reduce lock so you can use a transaction scope to wrap both read and write (as sol. A) PRO: Data safely updated CONS: Locks caused by read (repeatable reads) and update, the impacts vary based on the batch size and nature of the query itself

Solution E) Do not use transactions, so only the update will produce small locks (as sol. B) PRO: Data safely updated, but the data you read may have been changed in the meanwhile CONS: Locks caused by the updates

As @Ladislav correctly pointed, multiple inserts are really inefficient and a quick profiling on the database shows you how the ORM magic fails in this case. If you want to use EF to perform batch operations such inserts, update and deletes, I recommend you to have a look at this: EF Utilities

I tend to test locks using this query, I hope may help to understand better what is going on.

SELECT
    OBJECT_NAME(p.OBJECT_ID) AS TableName,
    resource_type,
    resource_description
FROM
    sys.dm_tran_locks l JOIN 
    sys.partitions p ON 
    l.resource_associated_entity_id = p.hobt_id
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜