User Defined Function Best Practice

2023-01-16 02:14 Q&A author:

I am contemplating using some user-defined function calls within some of my queries instead of using a bunch of inline CASE statements. The inline statements will probably perform better, but the functions make the queries so much easier to read and possibly maintain.

I just wanted to get an idea of what the typical best practice is for UDFs. I realize that using them in the criteria (WHERE clause) can have a significant performance impact.

Especially in those cases where you can have lots of when statements in your case block or even nested case statements.

Thanks,

My canned answer:

There is a popular misconception that UDFs have an adverse effect on performance. As a blanket statement, this is simply not true. In fact, inline table-valued UDFs are effectively macros: the optimizer is quite capable of rewriting and optimizing queries that involve them. Scalar UDFs, however, are usually very slow. I will provide a short example.

Prerequisites

Here is the script to create and populate the tables:

CREATE TABLE States(Code CHAR(2), [Name] VARCHAR(40), CONSTRAINT PK_States PRIMARY KEY(Code))

GO

INSERT States(Code, [Name]) VALUES('IL', 'Illinois')

INSERT States(Code, [Name]) VALUES('WI', 'Wisconsin')

INSERT States(Code, [Name]) VALUES('IA', 'Iowa')

INSERT States(Code, [Name]) VALUES('IN', 'Indiana')

INSERT States(Code, [Name]) VALUES('MI', 'Michigan')

GO

CREATE TABLE Observations(ID INT NOT NULL, StateCode CHAR(2), CONSTRAINT PK_Observations PRIMARY KEY(ID))

GO

SET NOCOUNT ON

DECLARE @i INT

SET @i=0

WHILE @i<100000 BEGIN

  SET @i = @i + 1

  INSERT Observations(ID, StateCode)

  SELECT @i, CASE WHEN @i % 5 = 0 THEN 'IL'

    WHEN @i % 5 = 1 THEN 'IA'

    WHEN @i % 5 = 2 THEN 'WI'

    WHEN @i % 5 = 3 THEN 'IN'

    WHEN @i % 5 = 4 THEN 'MI'

    END

END

GO

When a query involving a UDF is rewritten as an outer join.

Consider the following query:

SELECT o.ID, s.[name] AS StateName

  INTO dbo.ObservationsWithStateNames_Join

  FROM dbo.Observations o LEFT OUTER JOIN dbo.States s ON o.StateCode = s.Code

/*

SQL Server parse and compile time:

   CPU time = 0 ms, elapsed time = 1 ms.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'States'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



SQL Server Execution Times:

   CPU time = 187 ms,  elapsed time = 188 ms.

*/
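The figures in the comment above are in the format SQL Server prints when SET STATISTICS IO and SET STATISTICS TIME are switched on. The original does not show how they were captured, so the following is only a sketch of one way to reproduce them. The target table name dbo.ObservationsWithStateNames_Join2 is made up here so that it does not collide with the table created above.

```sql
-- Sketch only: enable per-statement I/O and timing output for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- Hypothetical target table name, chosen to avoid clashing with the one above.
SELECT o.ID, s.[Name] AS StateName
  INTO dbo.ObservationsWithStateNames_Join2
  FROM dbo.Observations o
  LEFT OUTER JOIN dbo.States s ON o.StateCode = s.Code;
GO

-- Switch the output off again when done.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

The exact numbers will differ from machine to machine; the point of comparison is the relative cost between the queries.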

And compare it to a query involving an inline table-valued UDF:

CREATE FUNCTION dbo.GetStateName_Inline(@StateCode CHAR(2))

RETURNS TABLE

AS

RETURN(SELECT [Name] FROM dbo.States WHERE Code = @StateCode);

GO

SELECT ID, (SELECT [name] FROM dbo.GetStateName_Inline(StateCode)) AS StateName

  INTO dbo.ObservationsWithStateNames_Inline

  FROM dbo.Observations

Both its execution plan and its execution costs are the same as for the join: the optimizer has rewritten it as an outer join. Don’t underestimate the power of the optimizer!
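The inline function can also be called through OUTER APPLY instead of a scalar subquery. This form is not part of the original answer and no timings were taken for it. Since the function is inline, the optimizer should be able to expand it in the same way. The target table name below is hypothetical.

```sql
-- Sketch only: OUTER APPLY keeps every row from Observations even when
-- the function returns no row, like the LEFT OUTER JOIN shown earlier.
SELECT o.ID, s.[Name] AS StateName
  INTO dbo.ObservationsWithStateNames_Apply  -- hypothetical table name
  FROM dbo.Observations o
  OUTER APPLY dbo.GetStateName_Inline(o.StateCode) s;
```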

A query involving a scalar UDF is much slower.

Here is a scalar UDF:

CREATE FUNCTION dbo.GetStateName(@StateCode CHAR(2))

RETURNS VARCHAR(40)

AS

BEGIN

  DECLARE @ret VARCHAR(40)

  SET @ret = (SELECT [Name] FROM dbo.States WHERE Code = @StateCode)

  RETURN @ret

END

GO
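The query that calls this function is not shown in the original. Presumably it follows the same pattern as the two queries above. The sketch below is an assumption, including the target table name.

```sql
-- Assumed form of the missing query: the scalar UDF is invoked for each row.
SELECT ID, dbo.GetStateName(StateCode) AS StateName
  INTO dbo.ObservationsWithStateNames_Scalar  -- hypothetical table name
  FROM dbo.Observations;
```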

Clearly the query using this UDF provides the same results but it has a different execution plan and it is dramatically slower:

/*

SQL Server parse and compile time:

   CPU time = 0 ms, elapsed time = 3 ms.

Table 'Worktable'. Scan count 1, logical reads 202930, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Observations'. Scan count 1, logical reads 188, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



SQL Server Execution Times:

   CPU time = 11890 ms,  elapsed time = 38585 ms.

*/

As you have seen, the optimizer can rewrite and optimize queries involving inline table-valued UDFs. Queries involving scalar UDFs, on the other hand, are not rewritten by the optimizer: the execution of the last query includes one function call per row, which is very slow. Copied from here.

Don't use functions solely for the sake of aesthetics. That can be handled by using consistent code formatting.

In the situation you describe, you create an external dependency--the function has to exist, and be visible to the user, for the query to run. Until SQL Server supports something identical to Oracle packages (assemblies are not native SQL)...

There's also a risk of falling into the trap of believing that SQL functions execute like methods/functions in procedural/OO programming. They don't, so queries can perform worse with the function than without it.

I tend to avoid most functions because, while they are prettier than inline CASE statements, they also tend to make the query plan less accurate. If you hide a lot of complexity in the function, then the complexity also tends to be hidden from the query plan. If you have problems and need to tune the query later, you'll generally be fixing things that show up as high cost but are in reality trivial compared to the cost of the UDF.
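If the CASE logic from the question does end up in a function, writing it as an inline table-valued function rather than a scalar one should keep it visible to the optimizer, as the first answer describes. A minimal sketch, assuming a made-up function dbo.GetRegion_Inline and an arbitrary grouping of the state codes from the example:

```sql
-- Hypothetical function: wraps a CASE expression in an inline TVF.
CREATE FUNCTION dbo.GetRegion_Inline(@StateCode CHAR(2))
RETURNS TABLE
AS
RETURN (
  SELECT CASE
           WHEN @StateCode IN ('IL', 'IN') THEN 'South'  -- arbitrary grouping
           WHEN @StateCode IN ('WI', 'MI') THEN 'North'
           ELSE 'Other'
         END AS Region
);
GO

-- CROSS APPLY is enough here because the function always returns one row.
SELECT o.ID, r.Region
  FROM dbo.Observations o
  CROSS APPLY dbo.GetRegion_Inline(o.StateCode) r;
```

The CASE expression lives in one place, and the call site stays nearly as short as a scalar call would be.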
