Why does putting a WHERE clause outside a view give terrible performance?

Let's say you have a view:

CREATE VIEW dbo.v_SomeJoinedTables AS
SELECT
    a.date,
    a.Col1,
    b.Col2,
    DENSE_RANK() 
      OVER(PARTITION BY a.date, a.Col2 ORDER BY a.Col3) as Something
FROM a JOIN b on a.date = b.date

I've found that the performance of:

SELECT *
FROM v_SomeJoinedTables
WHERE date > '2011-01-01'

is much worse than

SELECT *, 
   DENSE_RANK() 
     OVER(PARTITION BY a.date, a.Col2 ORDER BY a.Col3) as Something
FROM a JOIN b ON a.date = b.date
WHERE a.date > '2011-01-01'

I'm very surprised that the query plans for these two statements are not the same.

I've also tried using an inline table-valued function, but the query still takes 100-1000 times longer than the code where I copy and paste the view logic.

Any ideas?


It's called "Predicate pushing" aka deferred filtering.

SQL Server doesn't always realise the WHERE can be applied "earlier", inside the view effectively.

This was mitigated in SQL Server 2008, where it works more as expected.


I'm not a SQL expert, so I may be voted down for my foolishness, but my guess is that in the first case SQL is fetching the results of the entire view before applying the predicate in the WHERE clause. So when you query the view, it selects all of the records, puts them in memory, and then applies the Date filter after it is done.

This seems similar to the way the entire data set specified in your joins is fetched prior to applying the filter in the WHERE (lesson here is that you should apply predicates in your ON clause when possible).

Unless views are treated differently somehow.


The OVER() syntax was brand-new in SQL Server 2005 and apparently not well integrated into the optimizer. I suggest you try a more traditional expression, or probably no expression at all if you care about optimizability.

http://www.sqlteam.com/article/sql-sever-2005-using-over-with-aggregate-functions

Or, better, get a bit more familiar with the profiler - the view should be fixable.


Technically, you're not comparing between the same SQL statements. Your view indicates that it returns a.date, a.Col1, b.Col2, plus your DENSE_RANK() function. In your query without the view, you return all columns.

At first, you may think that returning all the columns would be worse. But it's difficult to determine which would be better without knowing what the table structure, including indexes, looks like.

Have you compared the query plans for each statement?
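
One way to make that comparison, as a rough sketch: turn on the I/O and timing statistics for the session, run both forms back to back, and then look at the actual execution plans to see where the date filter is applied in each. The object names are the ones from the question; the second query lists the view's columns explicitly (instead of SELECT *) so that both statements return the same thing.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Form 1: filter applied outside the view.
SELECT *
FROM dbo.v_SomeJoinedTables
WHERE date > '2011-01-01';

-- Form 2: view logic inlined, filter applied directly against table a.
SELECT
    a.date,
    a.Col1,
    b.Col2,
    DENSE_RANK()
      OVER(PARTITION BY a.date, a.Col2 ORDER BY a.Col3) AS Something
FROM a JOIN b ON a.date = b.date
WHERE a.date > '2011-01-01';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the first plan shows the filter sitting above the ranking step while the second shows it next to the read of table a, the predicate is not being pushed into the view.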


As a workaround, I would suggest using a function instead of a view so that you can pass in the date as a parameter.
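
A minimal sketch of what that might look like, assuming the filter is written inside the function body rather than applied to its result. The function name dbo.f_SomeJoinedTables and the parameter @FromDate are made up for illustration, and the datetime type is a guess at the column's type; the SELECT itself is copied from the view in the question.

```sql
-- Hypothetical inline table-valued function; name and parameter are illustrative.
CREATE FUNCTION dbo.f_SomeJoinedTables (@FromDate datetime)
RETURNS TABLE
AS
RETURN
(
    SELECT
        a.date,
        a.Col1,
        b.Col2,
        DENSE_RANK()
          OVER(PARTITION BY a.date, a.Col2 ORDER BY a.Col3) AS Something
    FROM a JOIN b ON a.date = b.date
    -- The filter is part of the function body, so it is applied
    -- alongside the join instead of to the finished result.
    WHERE a.date > @FromDate
);
GO  -- batch separator understood by SSMS and sqlcmd

SELECT *
FROM dbo.f_SomeJoinedTables('2011-01-01');
```

Whether this actually matches the speed of the hand-inlined query would need to be checked against the real tables and indexes, since the question reports that an inline function was still slow in at least one form.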
