SqlBulkCopy and Entity Framework

My current project consists of 3 standard layers: data, business, and presentation. I would like to use data entities for all my data access needs. Part of the app's functionality is that it will need to copy all the data in a flat file into a database. The file is not very big, so I can use SqlBulkCopy. I have found several articles about using the SqlBulkCopy class in .NET. However, all of them use DataTables to move data back and forth.

Is there a way to use data entities along with SqlBulkCopy or will I have to use DataTables?


You'll need to convert the entities to an IDataReader or a DataTable.

There is a small helper class designed to assist with this: http://archive.msdn.microsoft.com/LinqEntityDataReader/Release/ProjectReleases.aspx?ReleaseId=389

EDIT: the MSDN link is broken; an alternative copy can be found here: https://github.com/matthewschrager/Repository/blob/master/Repository.EntityFramework/EntityDataReader.cs

Then you can use SqlBulkCopy like so:

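// SqlBulkCopy is IDisposable, so wrap it in a using block.
using (var sbCopy = new SqlBulkCopy(connectionString))
{
    sbCopy.DestinationTableName = "TableName";
    // AsDataReader() is the extension method provided by the helper class above.
    sbCopy.WriteToServer(entitiesList.AsDataReader());
}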
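If you would rather take the DataTable route mentioned at the top of this answer, here is a minimal sketch of a reflection-based conversion. `EntityTableExtensions` and `ToDataTable` are hypothetical names, not part of EF or of the helper linked above, and the sketch assumes flat entities: navigation properties would have to be filtered out before SqlBulkCopy could handle the result.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

public static class EntityTableExtensions
{
    // Hypothetical helper: one column per public readable property of T.
    // Assumes T only exposes scalar properties (no navigation properties).
    public static DataTable ToDataTable<T>(this IEnumerable<T> items)
    {
        PropertyInfo[] props = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        var table = new DataTable(typeof(T).Name);
        foreach (PropertyInfo p in props)
        {
            // DataTable columns cannot be Nullable<T>; use the underlying type.
            Type columnType = Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType;
            table.Columns.Add(p.Name, columnType);
        }

        foreach (T item in items)
        {
            // Null property values become DBNull so they are stored as SQL NULL.
            object[] values = props
                .Select(p => p.GetValue(item, null) ?? DBNull.Value)
                .ToArray();
            table.Rows.Add(values);
        }

        return table;
    }
}
```

You would then call `sbCopy.WriteToServer(entitiesList.ToDataTable());`. Unless you add entries to `ColumnMappings`, columns are matched to the destination table by position, so the property order has to line up with the table's column order.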


We've tried and tested a couple of approaches to bulk inserting with EF, and eventually went with table-valued parameters (TVPs) to get the best performance across a range of row counts. I don't have the numbers to hand, but I know the article "Performance of bcp/BULK INSERT vs. Table-Valued Parameters" was a guiding factor.

We originally used SqlBulkCopy coupled with an adapter that took an IEnumerable<T> and created an IDataReader. It also generated the relevant metadata for SqlBulkCopy. The advantage here was that the import is a code-only thing. The code that @davehogan posted was used as a basis for this.

Table-valued parameters require a stored procedure and a table type defined in the database. If you're using code-first, you can execute SQL to create these as part of your creation script. Whilst this is more work, we found that we got significantly faster and more consistent throughput of rows into the database.
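As a rough sketch of what the client side of a TVP call looks like: the table type `dbo.PersonTableType`, the procedure `dbo.ImportPeople`, and the parameter name `@rows` are made-up names for illustration, and the example assumes the `System.Data.SqlClient` provider (the same calls exist in `Microsoft.Data.SqlClient`).

```csharp
using System.Data;
using System.Data.SqlClient;

public static class TvpImport
{
    // Assumes the database already contains (hypothetical names):
    //   CREATE TYPE dbo.PersonTableType AS TABLE (Id INT, Name NVARCHAR(100));
    //   CREATE PROCEDURE dbo.ImportPeople @rows dbo.PersonTableType READONLY AS ...
    public static void Import(string connectionString, DataTable rows)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.ImportPeople", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            // A structured parameter carries the whole set of rows in one call.
            SqlParameter parameter = command.Parameters.Add("@rows", SqlDbType.Structured);
            parameter.TypeName = "dbo.PersonTableType";
            parameter.Value = rows;

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

The DataTable's columns need to match the table type's columns in order and type.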

Also, it's worth considering not bulk inserting into your main table. We use a temp heap table and add a clustered index to it once the data is imported. We then perform a MERGE between the temp table and the main table. This has the benefit of not locking the main table's index while inserting and improves concurrency. We tend to get upwards of 2500 rows/sec per CPU inserted using this method.
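The staging approach above could look roughly like this. It is a sketch under assumptions, not our production code: the table `dbo.People`, its `Id`/`Name` columns, and the `#PeopleStaging` temp table are invented for the example, and a permanent staging table would work along the same lines.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class StagedImport
{
    public static void Import(string connectionString, IDataReader rows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // 1. Heap staging table (no indexes yet), local to this connection.
            Execute(connection,
                "CREATE TABLE #PeopleStaging (Id INT NOT NULL, Name NVARCHAR(100) NULL);");

            // 2. Bulk copy into the staging table over the same connection,
            //    so the temp table is visible to SqlBulkCopy.
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "#PeopleStaging";
                bulkCopy.WriteToServer(rows);
            }

            // 3. Add the clustered index only after the data has been loaded.
            Execute(connection,
                "CREATE CLUSTERED INDEX IX_PeopleStaging_Id ON #PeopleStaging (Id);");

            // 4. Merge the staged rows into the main table.
            Execute(connection, @"
MERGE dbo.People AS tgt
USING #PeopleStaging AS src ON tgt.Id = src.Id
WHEN MATCHED THEN
    UPDATE SET tgt.Name = src.Name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Name) VALUES (src.Id, src.Name);");
        }
    }

    private static void Execute(SqlConnection connection, string sql)
    {
        using (var command = new SqlCommand(sql, connection))
        {
            command.ExecuteNonQuery();
        }
    }
}
```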

Let me know if you want more info.


You can use the Bulk package library. Bulk Insert version 1.0.0 is meant for projects that use Entity Framework >= 6.0.0. More detail can be found in the Bulkoperation source code.


For EF Core there is BulkExtensions (Insert, InsertOrUpdate, Update, Delete):
Link: https://github.com/borisdj/EFCore.BulkExtensions
It can also be installed via NuGet.


SqlBulkCopy accepts an IDataReader in its WriteToServer method, so you should be able to implement IDataReader on top of any collection that is IEnumerable. This would allow you to take in an entity set and call SqlBulkCopy using your IDataReader implementation.
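A minimal sketch of such an adapter, assuming flat entities whose public properties line up with the destination columns. `EnumerableDataReader<T>` is a hypothetical name. As far as I understand, SqlBulkCopy mostly needs `Read`, `FieldCount` and `GetValue` (plus `GetOrdinal` when you map columns by name), so the remaining members are kept deliberately thin.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

// Hypothetical adapter: exposes an IEnumerable<T> as a forward-only IDataReader.
public sealed class EnumerableDataReader<T> : IDataReader
{
    private readonly IEnumerator<T> _rows;
    private readonly PropertyInfo[] _props;
    private bool _closed;

    public EnumerableDataReader(IEnumerable<T> rows)
    {
        _rows = rows.GetEnumerator();
        _props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();
    }

    // Core members: one field per property, one record per item.
    public int FieldCount => _props.Length;
    public bool Read() => _rows.MoveNext();
    public object GetValue(int i) => _props[i].GetValue(_rows.Current, null) ?? DBNull.Value;
    public bool IsDBNull(int i) => GetValue(i) is DBNull;
    public string GetName(int i) => _props[i].Name;
    public int GetOrdinal(string name) => Array.FindIndex(_props, p => p.Name == name);
    public Type GetFieldType(int i) =>
        Nullable.GetUnderlyingType(_props[i].PropertyType) ?? _props[i].PropertyType;
    public string GetDataTypeName(int i) => GetFieldType(i).Name;
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));

    public int GetValues(object[] values)
    {
        int count = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < count; i++) values[i] = GetValue(i);
        return count;
    }

    // Reader bookkeeping: a single result set with no nesting.
    public int Depth => 0;
    public bool IsClosed => _closed;
    public int RecordsAffected => -1;
    public bool NextResult() => false;
    public void Close() { _closed = true; _rows.Dispose(); }
    public void Dispose() => Close();

    // Typed getters simply unbox the property value.
    public bool GetBoolean(int i) => (bool)GetValue(i);
    public byte GetByte(int i) => (byte)GetValue(i);
    public char GetChar(int i) => (char)GetValue(i);
    public DateTime GetDateTime(int i) => (DateTime)GetValue(i);
    public decimal GetDecimal(int i) => (decimal)GetValue(i);
    public double GetDouble(int i) => (double)GetValue(i);
    public float GetFloat(int i) => (float)GetValue(i);
    public Guid GetGuid(int i) => (Guid)GetValue(i);
    public short GetInt16(int i) => (short)GetValue(i);
    public int GetInt32(int i) => (int)GetValue(i);
    public long GetInt64(int i) => (long)GetValue(i);
    public string GetString(int i) => (string)GetValue(i);

    // Left out of this sketch.
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) =>
        throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) =>
        throw new NotSupportedException();
}
```

Usage would be along the lines of `bulkCopy.WriteToServer(new EnumerableDataReader<MyEntity>(entities));`, where `MyEntity` stands in for your own entity type. The rows are streamed as SqlBulkCopy reads them, so the whole set never has to be copied into a DataTable first.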


You can consider a DataSet to be a serialisation of the data entity. However, generally speaking, I think SqlBulkCopy is a table-to-table thing, hence the reason for DataTables.


SqlBulkCopy is a direct, almost byte-array-like transfer of row data from client to SQL Server. It is easily the most efficient way to get data into SQL Server.

Its performance lies in truly "bulk" operations, however. Hundreds or thousands of rows aren't necessarily enough to justify its use. Tens of thousands to millions of rows are where SqlBulkCopy's performance will truly shine. And, in the end, all we're really talking about is getting data to the server.

There are other significant challenges in getting a set of rows into a production database's table. Reindexing, reordering (if there is a clustered index), foreign key validation, all these kinds of things add time to your insert and are potentially table- and index-locking.

Also, TVP data is written to disk (as temp table data), and is then accessible to put into your tables. SqlBulkCopy is capable of going directly at your table. Performance in that case is significantly faster; however, one must balance speed against concurrency.

I think the overall rule is: if you have a handful of rows to deal with, think TVPs, and if you have many thousands of rows, consider getting them to SQL Server as quickly as possible via SqlBulkCopy.
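To make the speed/concurrency trade-off a bit more concrete, here is a small sketch of a direct load with the knobs SqlBulkCopy exposes for it. The table name `dbo.People`, the `favorSpeed` flag, and the batch size are illustrative assumptions, not recommendations.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class DirectBulkLoad
{
    public static void Load(string connectionString, IDataReader rows, bool favorSpeed)
    {
        // TableLock takes a bulk update lock on the table for the duration of
        // the copy; without it, row locks are used, which is gentler on
        // concurrent readers and writers.
        SqlBulkCopyOptions options = favorSpeed
            ? SqlBulkCopyOptions.TableLock
            : SqlBulkCopyOptions.Default;

        using (var bulkCopy = new SqlBulkCopy(connectionString, options))
        {
            bulkCopy.DestinationTableName = "dbo.People"; // hypothetical table
            bulkCopy.BatchSize = 5000;     // rows sent per batch (illustrative value)
            bulkCopy.BulkCopyTimeout = 0;  // 0 = no time limit (the default is 30 seconds)
            bulkCopy.WriteToServer(rows);
        }
    }
}
```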
