Multiple rows with a single INSERT in SQLServer 2008
I am testing the speed of inserting multiple rows with a single INSERT statement.
For example: INSERT INTO [MyTable] VALUES (5, 'dog'), (6, 'cat'), (3, 'fish)
This is very fast until I pass 50 rows on a single statement, then the speed drops significantly.
开发者_运维技巧Inserting 10000 rows with batches of 50 take 0.9 seconds. Inserting 10000 rows with batches of 51 take 5.7 seconds.
My question has two parts:
- Why is there such a hard performance drop at 50?
- Can I rely on this behavior and code my application to never send batches larger than 50?
My tests were done in c++ and ADO.
Edit: It appears the drop off point is not 50 rows, but 1000 columns. I get similar results with 50 rows of 20 columns or 100 rows of 10 columns.
It could also be related to the size of the row. The table you use as an example seems to have only 2 columns. What if it has 25 columns? Is the performance drop off also at 50 rows?
Did you also compare with the "union all" approach shown here? http://blog.sqlauthority.com/2007/06/08/sql-server-insert-multiple-records-using-one-insert-statement-use-of-union-all/
I suspect there's an internal cache/index that is used up to 50 rows (it's a nice round decimal number). After 50 rows it falls back on a less efficient general case insertion algorithm that can handle arbitrary amounts of inputs without using excessive memory.
the slowdown is probably the parsing of the string values: VALUES (5, 'dog'), (6, 'cat'), (3, 'fish)
and not an INSERT issue.
try something like this, which will insert one row for each row returned by the query:
INSERT INTO YourTable1
(col1, col2)
SELECT
Value1, Value2
FROM YourTable2
WHERE ...--rows will be more than 50
and see what happens
If you are using SQL 2008, then you can use table value parameters and just do a single insert statement.
personally, I've never seen the slowdown at 50 inserts records even with regular batches. Regardless we moved to table value parameters which had a significant speed increase for us.
Random thoughts:
- is it completely consistent when run repeatedly?
- are you checking for duplicates in the 1st 10k rows for the 2nd 10k insert?
- did you try batch size of 51 first?
- did you empty the table between tests?
For high-volume and high-frequency inserts, consider using Bulk Inserts to load your data. Not the simplest thing int he world to implement and it brings with it a new set of challenges, but it can be much faster than doing an INSERT.
精彩评论