Fast inserts; BulkCopy with relational data

2023-02-10 21:07 Q&A author:

I have a large amount of constantly incoming data (roughly 10,000 a minute, and growing) that I want to insert into a database as efficiently as possible. At the moment I'm using prepared insert statements, but am thinking of using the SqlBulkCopy class to import the data in larger chunks.

The problem is that I'm not inserting into a single table - elements of the data item are inserted into numerous tables, and their identity columns are used as foreign keys in other rows that are inserted at the same time. I understand that bulk copies aren't meant to allow for more complex inserts like this, but I wonder if it is worth exchanging my identity columns (bigints in this case) for uniqueidentifier columns. This will allow me to do a couple of bulk copies for each table, and since I can determine the IDs before the insert, I don't need to check for anything like SCOPE_IDENTITY which is preventing me from using bulk copy.

Does this sound like a viable solution, or are there other potential issues I might face? Or, is there another way I can insert data quickly, but retain my use of bigint identity columns?

Thanks.

A uniqueidentifier will probably make things worse: randomly ordered values cause page splits in a clustered index, and the key is wider (16 bytes versus 8 for a bigint). See this

If your load is/can be batched, one option is to:

load a staging table
load the real tables in one go, in a stored procedure
use a uniqueidentifier in the staging table to identify each batch
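
The staging steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: Python's standard-library sqlite3 module stands in for SQL Server, plain Python functions stand in for SqlBulkCopy and the stored procedure, and every table and column name (orders, order_items, staging_orders, staging_items, batch_id, row_key) is made up for the example. The real tables keep their integer identity keys; the batch GUID and a per-batch row number are only used to match child rows to the parent keys the database assigned.

```python
import sqlite3
import uuid


def create_schema(conn):
    # Hypothetical schema: two "real" tables with database-assigned integer keys,
    # plus two flat staging tables that carry a batch GUID and a row number.
    conn.executescript("""
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            customer TEXT, batch_id TEXT, row_key INTEGER);
        CREATE TABLE order_items (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            order_id INTEGER REFERENCES orders(id), item TEXT);
        CREATE TABLE staging_orders (batch_id TEXT, row_key INTEGER, customer TEXT);
        CREATE TABLE staging_items (batch_id TEXT, row_key INTEGER, item TEXT);
    """)


def load_batch(conn, orders):
    """Load one batch. `orders` is a list of (customer, [item, ...]) pairs."""
    batch_id = str(uuid.uuid4())  # one GUID per batch, not per row
    parent_rows, child_rows = [], []
    for row_key, (customer, items) in enumerate(orders):
        parent_rows.append((batch_id, row_key, customer))
        child_rows.extend((batch_id, row_key, item) for item in items)

    with conn:  # one transaction for the whole batch
        # Step 1: bulk-load the flat staging tables (the SqlBulkCopy part).
        conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", parent_rows)
        conn.executemany("INSERT INTO staging_items VALUES (?, ?, ?)", child_rows)

        # Step 2: set-based load of the real tables (the stored-procedure part).
        conn.execute(
            "INSERT INTO orders (customer, batch_id, row_key) "
            "SELECT customer, batch_id, row_key FROM staging_orders "
            "WHERE batch_id = ?", (batch_id,))
        # Children pick up the generated parent key by joining on batch and row.
        conn.execute(
            "INSERT INTO order_items (order_id, item) "
            "SELECT o.id, s.item FROM staging_items s "
            "JOIN orders o ON o.batch_id = s.batch_id AND o.row_key = s.row_key "
            "WHERE s.batch_id = ?", (batch_id,))

        # Step 3: clear this batch out of staging.
        conn.execute("DELETE FROM staging_orders WHERE batch_id = ?", (batch_id,))
        conn.execute("DELETE FROM staging_items WHERE batch_id = ?", (batch_id,))
    return batch_id
```

Calling create_schema(conn) once and then load_batch(conn, [("alice", ["pen", "ink"]), ("bob", ["cup"])]) leaves two rows in orders and three in order_items, with no SCOPE_IDENTITY round trip per row.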

We deal with peaks of around 50k rows per second (and increasing) this way. We actually use a separate staging database to avoid double transaction log writes.

It sounds like you are planning on exchanging a "SQL assigns a [bigint identity() column] surrogate key" methodology for a "data prep routine assigns a GUID surrogate key" methodology. In other words, the key will not be assigned within SQL, but from outside SQL. Given your volumes, if the data-generating process can assign the surrogate key, I'd definitely go with that.
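
As a rough illustration of keys assigned outside SQL, the sketch below builds parent and child row sets in memory, with the foreign keys already filled in, so each set could then be handed to a bulk copy for its table. It is Python rather than .NET, and the function name assign_keys and the record layout ("name", "children") are invented for the example.

```python
import uuid


def assign_keys(records):
    """Return (parent_rows, child_rows) with client-generated GUID keys.

    `records` is a list of dicts shaped like
    {"name": str, "children": [str, ...]} (a hypothetical layout).
    """
    parent_rows, child_rows = [], []
    for record in records:
        parent_id = uuid.uuid4()  # key is known before anything is inserted
        parent_rows.append((parent_id, record["name"]))
        for value in record["children"]:
            # Each child row carries its own key plus the parent's key.
            child_rows.append((uuid.uuid4(), parent_id, value))
    return parent_rows, child_rows
```

Because every child row already holds its parent's key, the two row sets do not depend on any value the database returns, which is what removes the need for SCOPE_IDENTITY.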

The question then becomes, must you use GUIDs, or can your data-generation process produce auto-incrementing integers? Creating such a process that works consistently and infallibly is hard (one reason why you pay $$$ for SQL Server), but the trade-off for smaller and more human-legible keys within the database might be worth it.
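
One possible shape for such a process, offered only as a sketch: each loader reserves a block of integers from a shared counter and hands them out locally, so it only touches the shared counter once per block. The class names BlockIdAllocator and InMemoryCounter are made up for the example, and the in-memory counter is a stand-in; a real setup would need a durable, shared source of ranges, which is exactly the hard part mentioned above.

```python
import threading


class InMemoryCounter:
    """Stand-in for a durable shared counter (an assumption for this sketch)."""

    def __init__(self, start=1):
        self._value = start
        self._lock = threading.Lock()

    def reserve(self, count):
        # Atomically hand out `count` consecutive ids; return the first one.
        with self._lock:
            first = self._value
            self._value += count
            return first


class BlockIdAllocator:
    """Hands out integer ids from locally cached blocks."""

    def __init__(self, reserve_block, block_size=1000):
        self._reserve_block = reserve_block  # callable: count -> first id
        self._block_size = block_size
        self._next = 0
        self._end = 0  # exclusive upper bound of the current block
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            if self._next >= self._end:
                # Current block is used up: reserve a fresh one.
                self._next = self._reserve_block(self._block_size)
                self._end = self._next + self._block_size
            value = self._next
            self._next += 1
            return value
```

Ids from one allocator are increasing but not gap-free: if a loader stops partway through a block, the rest of that block is never used.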
