开发者

SQL Server table valued function performance

I have a 5 year data set partitioned into quarterly tables. I also have a master view that joins them all together. When a user needs more than one quarter of data they often use the master view instead of joining multiple quarters together.

My question is, would a table valued function which accepts a date range and returns only the records from the necessary partitions be faster than querying the entire master view?

This is my current view definition:

ALTER VIEW [dbo].[loandetails_test]
AS
SELECT     *
FROM         loandetails05
where year(date) = 2005
UNION
SELECT     *
FROM         loandetails06
where year(date)开发者_JAVA百科 = 2006
UNION
SELECT     *
FROM         loandetails07
where year(date) = 2007
UNION
SELECT     *
FROM         loandetails08
where year(date) = 2008
UNION
SELECT     *
FROM         loandetails1q09
where date >= '1/1/2009' and date < '4/1/2009'
UNION
SELECT     *
FROM         loandetails2q09
where date >= '4/1/2009' and date < '7/1/2009'
UNION
SELECT     *
FROM         loandetails3q09
where date >= '7/1/2009' and date < '10/1/2009'
UNION
SELECT     *
FROM         loandetails4q09
where date >= '10/1/2009' and date < '1/1/2010'
UNION
SELECT     *
FROM         loandetails1q10
where date >= '1/1/2010' and date < '4/1/2010'
UNION
SELECT     *
FROM         loandetails2q10
where date >= '4/1/2010' and date < '7/1/2010'
UNION
SELECT     *
FROM         loandetails3q10
where date >= '7/1/2010' and date < '10/1/2010'
UNION
SELECT     *
FROM         loandetails4q10
where date >= '10/1/2010' and date < '1/1/2011'
union
SELECT     *
FROM         loandetails_CURRENT
where date >= '1/1/2011' and date < '4/1/2011'


GO


The answer should be a solid no.

Partitions are set up with implicit criteria, so if you are doing it by date (quarter), SQL Server already knows which partitions will satisfy the query (assuming the query will have a date filter). Check the execution plan which will confirm a stream-merge between two (or as many as involved) partitions.

I have a case where tables from N databases (yes one per silo) are joined in a master view, like you have. The master view uses a filter for each one, specifically it looks like this

select source=1, col1, col2, col3
from db1.dbo.tbl
union all
select source=2, col1, col2, col3
from db2.dbo.tbl
etc

Any query that asks for where source in (2,3) automatically recognizes that only 2 dbs need to be searched, and the execution plan reveals as much.

If you manually created date-partitioned queries, you can

  1. have an index on the date range, within each table
  2. force the optimizer to recognize the partitioning

Here is a working example (even without indexes). Notice that Q1 and Q4 are not even showing in the plan. Disclosure: SQL Server 2008 R2 Express

select dateadd(d, number, '20100101') TheDate, *
into Q1data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =1

select dateadd(d, number, '20100101') TheDate, *
into Q2data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =2

select dateadd(d, number, '20100101') TheDate, *
into Q3data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =3

select dateadd(d, number, '20100101') TheDate, *
into Q4data
from master..spt_values
where type='p' and number between 1 and 370
and datepart(quarter, dateadd(d, number, '20100101')) =4
GO

create view Ydata
with schemabinding
as
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q1Data where TheDate >= '20100101' and TheDate < '20100401'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q2Data where TheDate >= '20100401' and TheDate < '20100701'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q3Data where TheDate >= '20100701' and TheDate < '20101001'
union all
select TheDate, name, number, TYPE, LOW, high, status
from dbo.Q4Data where TheDate >= '20101001' and TheDate < '20110101'
GO

select * from YData where TheDate between '20100601' and '20100831'

Query Plan

SQL Server table valued function performance

Re: Updated question

When dealing with date ranges, NEVER (with few exceptions) use a function on the date column. This requires the function to be run against ALL records in the table before comparing to the other side.

where year(date) = 2005
===> means scan the table, for each row take the year, compare to 2005

Better to write as

where date >= '20050101' and date < '20060101'
===> means given a date range, use the index to seek the range
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜