Entity Framework Algorithm For Combining Data
This pertains to a project I am inheriting and cannot change table structure or data access model. I have been asked to optimize the algorithm being used to insert data into the database.
We have a dataset in table T. From that, we pull a set we will call A. We also query an XML feed and get a set we will call X.
- If a value from X is in A, the record in A must be updated to reflect the data from X.record
- If a value from X is not in A, X.record must be inserted into A
- If a value from A is not in X, A.record must be preserved in A
- X must be fully iterated through for all records, and A must be updated
All of these changes need to be inserted back into the database.
The algorithm as set up does the following:

```csharp
// query XML into a list
var xmlItems = QueryXmlFeed(); // placeholder for the actual XML query

foreach (var item in xmlItems)
{
    // look up item in A via LINQ
    var query = from record in context.TableName
                where record.GUID == item.GUID
                select record;

    if (query.Count() == 0)
    {
        context.AddToTableName(newTableNameObject); // insert into A
    }
    else
    {
        var currentRecord = query.First();
        // set all properties on currentRecord = properties from item
    }

    context.SaveChanges();
}
```
I know this to be suboptimal. I tried to pull A into an object (call it queryA) outside of the foreach loop in an effort to move the lookups into memory and avoid hitting the disk, but after thinking that through, I realized the database was already in memory.
Having added timer objects to the algorithm, it's clear that what is costing the most time is the SaveChanges() function call. In some cases it takes 20ms, and in others, inexplicably, it jumps to 100ms.
I would prefer to call SaveChanges() only one time, but I can't figure out how to do that given my depth of knowledge of EF (which is thin at best) and the constraints of not being able to change the table structures and having to preserve the data in A that is not in X.
Suggestions?
I don't think that you will improve the performance when using Entity Framework:
The query
- Loading each record with a separate query is not good.
- You can improve the performance by loading multiple records in the same query. For example, you can load a small batch of records by using either `||` in the condition or `Contains` (like `IN` in SQL). `Contains` is supported only in .NET 4.0; see the sketch after this list.
- Another improvement can be replacing the query with a stored procedure and a table-valued parameter: pass all the GUIDs to SQL Server, join A with X's GUIDs, and get the results back. Table-valued parameters are supported only on SQL Server 2008 and newer.
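A minimal sketch of the `Contains` approach, reusing the `context.TableName` set from the question (`xmlItems` stands in for the list queried from the XML feed):

```csharp
// Collect the GUIDs from the feed, then load every matching record
// from A in one query (Contains translates to a SQL IN clause).
var xmlGuids = xmlItems.Select(x => x.GUID).ToList();

var existing = (from record in context.TableName
                where xmlGuids.Contains(record.GUID)
                select record)
               .ToDictionary(r => r.GUID);
```

One query replaces N lookups, and the dictionary makes the per-item check inside the loop a cheap in-memory operation.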
The data modification
- You should not call `SaveChanges` after each modification. You can call it once after the foreach loop and it will still work: it will pass all the modifications in a single transaction, and according to this answer it can give you a significant performance boost (see the first sketch after this list).
- EF doesn't support command batching, and because of that each update or insert always takes a separate round trip to the database. There is no way around this when using EF to modify data, short of implementing a whole new EF ADO.NET provider (which is like starting a new project).
- Again, the solution is reducing round trips by using a stored procedure with a table-valued parameter (see the second sketch after this list).
- If your DB also uses that GUID as the primary key and the clustered index, you take another performance hit from reordering the index after each insert, i.e. modifying data on disk.
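A minimal sketch of the single-`SaveChanges` version, assuming the `existing` dictionary from the earlier sketch and a hypothetical `CreateTableNameObject` helper that builds a new entity from a feed item:

```csharp
foreach (var item in xmlItems)
{
    TableName currentRecord;
    if (existing.TryGetValue(item.GUID, out currentRecord))
    {
        // update: copy each property from the feed item
        currentRecord.SomeProperty = item.SomeProperty; // ...and so on
    }
    else
    {
        // insert: add a new entity built from the feed item
        context.AddToTableName(CreateTableNameObject(item));
    }
}

context.SaveChanges(); // one call, one transaction
```

Records in A with no match in X are never touched, so they are preserved as required. Each insert and update is still a separate round trip, but the per-call overhead of `SaveChanges` is paid only once.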
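And a rough sketch of the stored-procedure route using plain ADO.NET; the procedure name `dbo.UpsertFeedRows`, the TVP type `dbo.FeedRowType`, and the columns are all assumptions that would have to mirror table T:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Build a DataTable whose shape matches the TVP type on the server.
var rows = new DataTable();
rows.Columns.Add("Guid", typeof(Guid));
rows.Columns.Add("Name", typeof(string)); // one column per property of T

foreach (var item in xmlItems)
    rows.Rows.Add(item.GUID, item.Name);

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("dbo.UpsertFeedRows", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    var parameter = command.Parameters.AddWithValue("@Rows", rows);
    parameter.SqlDbType = SqlDbType.Structured;
    parameter.TypeName = "dbo.FeedRowType";

    connection.Open();
    command.ExecuteNonQuery(); // one round trip for the entire feed
}
```

Inside the procedure, a single `MERGE` (or an `UPDATE` plus `INSERT` pair) against A handles both cases in one statement, and rows in A that are not in the feed are left alone.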
The problem is not in the algorithm but in the way you process the data and in the technology used to process it. Entity Framework is not a good choice for data pumps. You should take this information to your boss, because improving the performance means a more complicated change to your application. It is not your fault, and it is not the fault of the programmer who wrote the application. This is a characteristic of EF that is not very well known and, as far as I know, is not clearly stated in any MS best practices.