开发者

fastest way to update varchar column with text

I have the following in a column:

MetaDataServe
-------------
MindWorks.Accounts
MindWorks.Transactions
MindWorks.Commissions
...

I need to update those columns to be in the form of:

MindWorks.Client.Accounts

I initially thought of using the string functions to update them (LEFT, SUBSTR etc), but is there an alternative or better method of updating text in a column to insert text?

I am using SQL Server 2008 an开发者_JAVA技巧d can't use CLR integration.


This is a question about "fastest", so timings are provided below

Test setup, table with >1 million rows

create table MetaDataServe (id int identity primary key, vc varchar(max));

insert MetaDataServe values
('MindWorks.Accounts'),
('MindWorks.Transactions'),
('MindWorks.Commissions');

insert MetaDataServe
select vc
from MetaDataServe, master..spt_values a, master..spt_values b
where b.number between 1 and 30
-- (1090110 row(s) affected)

Stuff vs Replace vs SubString

Performance Summary - STUFF > SUBSTRING > REPLACE

update MetaDataServe set vc = STUFF(vc, 9, 0, '.Client')

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
Table 'MetaDataServe'. Scan count 1, logical reads 55402, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 10094 ms, elapsed time = 10808 ms.
CPU time = 10250 ms, elapsed time = 10896 ms.

(the 2 times are from multiple executions to show the variability, it is quite low so the times can be considered accurate to within 3%)

update MetaDataServe set vc = REPLACE(vc, '.', '.Client.')

SQL Server parse and compile time:
CPU time = 3 ms, elapsed time = 3 ms.
Table 'MetaDataServe'. Scan count 1, logical reads 55402, physical reads 0, read-ahead reads 159, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 20469 ms, elapsed time = 21238 ms.

update MetaDataServe set vc = 'MindWorks.Client.' + SUBSTRING(vc, 11, 100)

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'MetaDataServe'. Scan count 1, logical reads 55402, physical reads 0, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 11219 ms, elapsed time = 12030 ms.
CPU time = 11531 ms, elapsed time = 12148 ms.

Fixed number vs PATINDEX vs CHARINDEX

(The fixed position version is already given above)
Performance Summary - FIXED > (PATINDEX = CHARINDEX)
There appears to be no material difference between PATINDEX and CHARINDEX

update MetaDataServe set vc = STUFF(vc, PATINDEX('%.%',vc), 0, '.Client')

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 2 ms.
Table 'MetaDataServe'. Scan count 1, logical reads 55400, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 15218 ms, elapsed time = 16167 ms.

update MetaDataServe set vc = STUFF(vc, CHARINDEX('.',vc), 0, '.Client')

SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 3 ms.
Table 'MetaDataServe'. Scan count 1, logical reads 55402, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 15469 ms, elapsed time = 16353 ms.

Notes:

  • All update statements given above will work (with a tweak or two) depending on your needs
  • Before each test, the entire table is dropped and recreated to prevent caching issues

CAUTION !

Even though STUFF is faster, you can get into tricky situations. If your data contains

"MindWorksNoDot"

And you update using

update MetaDataServe set vc = STUFF(vc, CHARINDEX('.',vc), 0, '.Client')

You end up with NULL! Because when CHARINDEX cannot find the dot, the second parameter to STUFF of zero (0) causes the entire string to go to NULL.

FINAL WORDS

For safety and reliability, given it is only 33% slower than the STUFF approach, I would simply use a REPLACE statement, i.e.

update MetaDataServe set vc = REPLACE(vc, '.', '.Client.')


Using the STUFF command:

WITH sample AS (
  SELECT 'MindWorks.Accounts' AS col
  UNION ALL
  SELECT 'MindWorks.Transactions'
  UNION ALL
  SELECT 'MindWorks.Commissions')
SELECT s.col,
       STUFF(s.col, CHARINDEX('.', s.col), 1, '.Client.') AS col2
  FROM sample s

Using the REPLACE command:

WITH sample AS (
  SELECT 'MindWorks.Accounts' AS col
  UNION ALL
  SELECT 'MindWorks.Transactions'
  UNION ALL
  SELECT 'MindWorks.Commissions')
SELECT s.col,
       REPLACE(s.col, '.', '.Client.') AS col2
  FROM sample s

Output:

col                      col2
--------------------------------------------------------
MindWorks.Accounts       MindWorks.Client.Accounts
MindWorks.Transactions   MindWorks.Client.Transactions
MindWorks.Commissions    MindWorks.Client.Commissions

Conclusion

Of the two, STUFF is likely to be more flexible. Just depends on your needs.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜