
Large number of UPDATE queries slowing down page

I am reading and validating large fixed-width text files (ranging from 10K to 50K lines) that are submitted via our ASP.NET website (coded in VB.Net). I do an initial scan of the file to check for basic issues (line length, etc.). Then I import each row into a MS SQL table. Each DB row basically consists of a record_ID (primary key, auto-incrementing) and about 50 varchar fields.

After the insert is done, I run a validation function on the file that checks each field in each row against a bunch of criteria (trimmed length, isnumeric, range checks, etc.). If it finds an error in any field, it inserts a record into the Errors table, which has an error_ID, the record_ID and an error message. In addition, if the field fails in a particular way, I have to do a "reset" on that field. A reset might consist of blanking the entire field, or simply replacing the value with another one (e.g. replacing the string with a new one that has all illegal chars taken out).

I have a 5,000 line test file. The upload, initial check, and import takes about 5-6 seconds. The detailed error check and insert into the Errors table takes about 5-8 seconds (this file has about 1,200 errors in it). However, the "resets" part takes about 40-45 seconds for the 750 fields that need to be reset. When I comment out the resets function (returning immediately without actually calling the UPDATE stored proc), the process is very fast. With the resets turned on, the page takes 50 seconds to return.

My UPDATE stored proc is using some recommended code from http://sommarskog.se/dynamic_sql.html, whereby it uses CASE instead of dynamic SQL:

UPDATE dbo.Records
SET    dbo.Records.file_ID = CASE @field_name WHEN 'file_ID' THEN @field_value ELSE file_ID END,
.
. (all 50 varchar field CASE statements here)
.
WHERE dbo.Records.record_ID = @record_ID

Is there any way I can improve performance here? Can I somehow group all of these UPDATE calls into a single transaction? Should I be reworking the UPDATE query somehow? Or is it just the sheer quantity of 750+ UPDATEs, and things are simply slow (it's a quad-proc server with 8GB of RAM)?

Any suggestions appreciated.


Don't do this in SQL; fix the data up in code, then do your updates.

If you have SQL 2008, then look into table-valued parameters. They enable you to pass an entire table as a parameter to a stored proc. From there you just have the one INSERT/UPDATE or MERGE statement.
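For example, here is a minimal sketch of the TVP approach; the table type dbo.FieldResetType, the proc dbo.ApplyResets, the column layout, and connectionString are all assumptions for illustration, not anything from the question:

using System.Data;
using System.Data.SqlClient;

// Assumes a table type and a proc that consumes it already exist, e.g.:
//   CREATE TYPE dbo.FieldResetType AS TABLE
//       (record_ID INT, field_name SYSNAME, field_value VARCHAR(255));
//   CREATE PROCEDURE dbo.ApplyResets @resets dbo.FieldResetType READONLY AS ...

// Accumulate one row per field reset during the validation pass.
DataTable resets = new DataTable();
resets.Columns.Add("record_ID", typeof(int));
resets.Columns.Add("field_name", typeof(string));
resets.Columns.Add("field_value", typeof(string));
resets.Rows.Add(42, "file_ID", "cleaned value");   // example row

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand("dbo.ApplyResets", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    SqlParameter p = cmd.Parameters.AddWithValue("@resets", resets);
    p.SqlDbType = SqlDbType.Structured;   // send the whole table in one go
    p.TypeName = "dbo.FieldResetType";
    conn.Open();
    cmd.ExecuteNonQuery();                // one round trip for all 750 resets
}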


If you're looping through the lines and doing individual updates/inserts, this can be really expensive... Consider using SqlBulkCopy, which can speed up all your inserts. Similarly, you can create a DataSet, make your updates on the DataSet, and then submit them all in one shot through a SqlDataAdapter.
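A rough sketch of both ideas; connectionString, parsedRows, the column list, and CleanValue are placeholders I've made up:

using System.Data;
using System.Data.SqlClient;

// Inserts: stream the parsed file into dbo.Records in one bulk operation.
using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.Records";
    bulk.BatchSize = 5000;
    bulk.WriteToServer(parsedRows);   // parsedRows is a DataTable of the file
}

// Updates: pull the rows, apply every reset in memory, push the changes back.
using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlDataAdapter adapter = new SqlDataAdapter(
           "SELECT record_ID, field1, field2 /* ...all 50... */ FROM dbo.Records", conn))
{
    SqlCommandBuilder builder = new SqlCommandBuilder(adapter);   // generates the UPDATE
    DataTable records = new DataTable();
    adapter.Fill(records);

    foreach (DataRow row in records.Rows)
    {
        row["field1"] = CleanValue((string)row["field1"]);   // your reset logic here
    }

    adapter.UpdateBatchSize = 100;   // batch the generated UPDATEs per round trip
    adapter.Update(records);
}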


I believe you are doing 50 case statements on every update. Sounds like that would be slow.

It is possible to solve this problem with injection-proof code via parameterized queries and a string constant table.

Quick and dirty example code.

string[] queryList = { "UPDATE records SET col1 = @val WHERE ID = @key",
                       "UPDATE records SET col2 = @val WHERE ID = @key",
                       "UPDATE records SET col3 = @val WHERE ID = @key",
                        ...
                       "UPDATE records SET col50 = @val WHERE ID = @key" };

Then in your call to SQL you just pick the item in the array corresponding to the column you want to update, and set the @val and @key parameters on the command.
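Something like this on the .NET side (colIndex, newValue, recordId, and connectionString are placeholder names):

using System.Data.SqlClient;

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand(queryList[colIndex], conn))
{
    cmd.Parameters.AddWithValue("@val", newValue);   // the reset value
    cmd.Parameters.AddWithValue("@key", recordId);   // the record_ID
    conn.Open();
    cmd.ExecuteNonQuery();
}

Wrapping all 750 of these in a single SqlTransaction on one open connection, as you suggested, should also cut the per-statement commit overhead.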

I'm guessing you will see a significant improvement... let me know how it goes.


Um. Why are you inserting numeric data into VARCHAR fields and then trying to run numeric checks on it? This is yucky.

Apply correct data typing and constraints to your table, do the INSERT, and see if it failed. SQL Server will happily report errors back to you.
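As a rough sketch of that idea (the column, constraint, and LogError names are made up):

using System.Data.SqlClient;

// One-time schema change, hypothetical names:
//   ALTER TABLE dbo.Records ALTER COLUMN quantity INT NULL;
//   ALTER TABLE dbo.Records ADD CONSTRAINT CK_Records_quantity
//       CHECK (quantity BETWEEN 0 AND 9999);
try
{
    insertCommand.ExecuteNonQuery();   // the SqlCommand built for this row
}
catch (SqlException ex)
{
    // SQL Server reports the type/range violation; record it rather than
    // re-implementing isnumeric/range checks by hand.
    LogError(recordId, ex.Message);    // LogError is hypothetical
}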


I would try changing the recovery model to simple and look at my indexes. Kimberly Tripp did a session showing a scenario with improved performance using a heap.
