SQL Server 2008: Very slow query performance when dealing with specific date boundary
Right, this is an odd one.
We have a web service that returns data to a Silverlight client. The queries are generated against a SQL Server 2008 database by Entity Framework. Most of these queries are date-range based - pull results between this date and that, for example. Also, a view is used to make things a little easier.
We have noticed that when issuing a specific set of queries that start on or after a certain date, execution will be very slow. This date is the 5th November, any year. If our start date is one day earlier execution will be instant. So, 5th - 6th, slow. 4th - 6th - quick.
Here is the SQL that EF generates:
THIS QUERY WILL BE VERY SLOW (30 seconds)
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [Name],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[Name] AS [K1],
SUM([Extent1].[Value]) AS [A1]
FROM (SELECT
[view_answers].[Value] AS [Value],
[view_answers].[Comment] AS [Comment],
[view_answers].[NewStockist] AS [NewStockist],
[view_answers].[NewDistPoint] AS [NewDistPoint],
[view_answers].[VoucherUsed] AS [VoucherUsed],
[view_answers].[CashTotal] AS [CashTotal],
[view_answers].[AnswerType] AS [AnswerType],
[view_answers].[StartTime] AS [StartTime],
[view_answers].[ActualEndTime] AS [ActualEndTime],
[view_answers].[Complete] AS [Complete],
[view_answers].[UserID] AS [UserID],
[view_answers].[UserName] AS [UserName],
[view_answers].[QuestionType] AS [QuestionType],
[view_answers].[ProductSKU] AS [ProductSKU],
[view_answers].[BrandID] AS [BrandID],
[view_answers].[TeamID] AS [TeamID],
[view_answers].[Name] AS [Name],
[view_answers].[Stage] AS [Stage],
[view_answers].[Question] AS [Question]
FROM [dbo].[view_answers] AS [view_answers]) AS [Extent1]
WHERE
([Extent1].[UserID] = '16E3692F-806E-40A0-BB99-ABBBCC13060D')
AND (N'Distribution Points' = [Extent1].[QuestionType])
AND ([Extent1].[StartTime] >= '11/05/2009 00:00:00')
AND ([Extent1].[StartTime] <= '11/08/2009 00:00:00')
AND (1 = [Extent1].[Complete])
AND (2 = [Extent1].[BrandID])
AND (N'Distribution Points' = [Extent1].[QuestionType])
GROUP BY
[Extent1].[Name])
AS [GroupBy1]
THIS ONE WILL BE MUCH QUICKER
SELECT
1 AS [C1],
[GroupBy1].[K1] AS [Name],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[Name] AS [K1],
SUM([Extent1].[Value]) AS [A1]
FROM (SELECT
[view_answers].[Value] AS [Value],
[view_answers].[Comment] AS [Comment],
[view_answers].[NewStockist] AS [NewStockist],
[view_answers].[NewDistPoint] AS [NewDistPoint],
[view_answers].[VoucherUsed] AS [VoucherUsed],
[view_answers].[CashTotal] AS [CashTotal],
[view_answers].[AnswerType] AS [AnswerType],
[view_answers].[StartTime] AS [StartTime],
[view_answers].[ActualEndTime] AS [ActualEndTime],
[view_answers].[Complete] AS [Complete],
[view_answers].[UserID] AS [UserID],
[view_answers].[UserName] AS [UserName],
[view_answers].[QuestionType] AS [QuestionType],
[view_answers].[ProductSKU] AS [ProductSKU],
[view_answers].[BrandID] AS [BrandID],
[view_answers].[TeamID] AS [TeamID],
[view_answers].[Name] AS [Name],
[view_answers].[Stage] AS [Stage],
[view_answers].[Question] AS [Question]
FROM [dbo].[view_answers] AS [view_answers]) AS [Extent1]
WHERE
([Extent1].[UserID] = '16E3692F-806E-40A0-BB99-ABBBCC13060D')
AND (N'Distribution Points' = [Extent1].[QuestionType])
AND ([Extent1].[StartTime] >= '11/04/2009 00:00:00')
AND ([Extent1].[StartTime] <= '11/08/2009 00:00:00')
AND (1 = [Extent1].[Complete])
AND (2 = [Extent1].[BrandID])
AND (N'Distribution Points' = [Extent1].[QuestionType])
GROUP BY
[Extent1].[Name])
AS [GroupBy1]
If we set the start date to 5th November of last year, execution is slow; set it to 4th November of last year, and it's fast again. Looking at the data in the database, there is nothing unusual around the 5th. It also appears that queries where the start date is after the 5th run slow.
I'm stumped!
(the database is hosted remotely so I don't have direct access to it)
UPDATE
Thanks for the replies, guys. Firstly, I should probably make it clear that my knowledge of SQL Server is skin deep. I build databases of varying degrees of quality and then use something like LINQ to SQL or EF to work with them. So I feel a little out of my depth here.
Regarding the joins - the view I'm querying does include data from about 6-7 tables. I shall try to grab some stats next time I'm at work and add some more info here. I don't really know much about execution plans, or whether they are something I can see through SQL Server Management Studio.
UPDATE Stats from the slow query (3 row(s) affected):
Table 'tblProducts'. Scan count 0, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblQuestionTypes'. Scan count 0, logical reads 1496, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblQuestions'. Scan count 0, logical reads 1496, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblSessions'. Scan count 0, logical reads 28551, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAnswers'. Scan count 1, logical reads 1976256, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCalls'. Scan count 1, logical reads 439, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblUsers'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Stats from the fast query:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblAnswers'. Scan count 1, logical reads 7008, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblProducts'. Scan count 0, logical reads 22, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblQuestions'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblQuestionTypes'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblCalls'. Scan count 1, logical reads 439, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblSessions'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblUsers'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
UPDATE The query execution plan (a new thing to me) suggests that I add a new index on a column in one of the tables: QuestionID in the answers table. The suggested SQL is as follows:
USE [brandfourcoke]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[tblAnswers] ([QuestionID])
INCLUDE ([CallID],[Value])
GO
The estimated improvement in query cost is 93%. Should I go ahead and do this? The database is in a live environment and uses SQL Server 2008's change tracking and the Sync Framework. As such, I'm always concerned that database changes will impact the tracked data and break things for my clients. Will adding an index reset change tracking data? Thanks for all your help, guys. I feel like a complete beginner here.
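For what it's worth, the placeholder index name in the generated script has to be replaced with a real one before it will run. A sketch of what that might look like (the index name here is hypothetical, and WITH (ONLINE = ON) is an optional Enterprise-edition feature that avoids blocking writers during the build - omit it on other editions):

```sql
USE [brandfourcoke];
GO
-- Hypothetical name following a common IX_<table>_<column> convention.
CREATE NONCLUSTERED INDEX [IX_tblAnswers_QuestionID]
ON [dbo].[tblAnswers] ([QuestionID])
INCLUDE ([CallID], [Value]);
GO
```

Adding a nonclustered index is a metadata/storage change, not a data change, so as far as I know it should not reset change tracking versions - but given a live environment, testing on a copy first would be prudent.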
Run the slow and fast queries with SET STATISTICS IO ON
and see if there is a significant difference in the number of logical/physical reads between the two.
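In SSMS that might look like the following (the query body is abbreviated here - paste in either of the full queries above):

```sql
SET STATISTICS IO ON;

-- Run the slow query, then the fast one; per-table read counts
-- appear on the Messages tab for each execution.
SELECT 1 AS [C1] /* ... rest of the generated query ... */;

SET STATISTICS IO OFF;
```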
Most likely there is a strong skew in the data. For instance, the plan for the fast query might drive a nested loop from a result of 10 rows (resulting in 10 lookups), while the slow one suddenly sees 10,000 rows where the previous one saw 10, resulting in 10,000 lookups. Although your query has no joins, the engine may use various access indexes and join them against the clustered index. The actual execution plan will always reveal exactly what's going on.
Without knowing anything additional, first guess would be that you are likely either hitting the tipping point of index usage for a particular operation or have outdated/wrong statistical distribution information for given column(s). Could be a few other things as well such as suboptimal indexing (perhaps wrong, perhaps key ordered incorrectly, etc.), given that you are using Sql 2008 are you sure you haven't created a filtered index rather than a traditional full index on the particular column(s), etc. - however, in order to determine that, we'd need to see a lot more information (i.e. schema, indexes, query plans, data distribution, statistics, etc.).
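If outdated statistics turn out to be the culprit, refreshing them on the large table is a cheap first experiment (table name taken from the STATISTICS IO output above; WITH FULLSCAN is optional but produces the most accurate histogram):

```sql
-- Rebuild the optimizer's data-distribution statistics for the table.
UPDATE STATISTICS [dbo].[tblAnswers] WITH FULLSCAN;
```

If the slow date range becomes fast immediately afterwards, stale statistics were almost certainly the cause.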
Might help if you could post the query plan used for each of the queries listed above, that would at least help us determine if you are getting drastically different plans.
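Besides SSMS's "Include Actual Execution Plan" button, one way to capture the actual plans as shareable XML is:

```sql
SET STATISTICS XML ON;

-- Run each query; the actual plan is returned as an
-- additional XML result set alongside the normal results.

SET STATISTICS XML OFF;
```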
That query is one reason why using generated SQL is often a terrible idea. As far as I can tell, this is the equivalent query if you had written it yourself:
SELECT 1 AS [C1],
[view_answers].[Name] AS [K1],
SUM([view_answers].[Value]) AS [C2]
FROM [view_answers]
WHERE ([view_answers].[UserID] = '16E3692F-806E-40A0-BB99-ABBBCC13060D')
AND (N'Distribution Points' = [view_answers].[QuestionType])
AND ([view_answers].[StartTime] >= '11/04/2009 00:00:00')
AND ([view_answers].[StartTime] <= '11/08/2009 00:00:00')
AND (1 = [view_answers].[Complete])
AND (2 = [view_answers].[BrandID])
AND (N'Distribution Points' = [view_answers].[QuestionType])
GROUP BY [view_answers].[Name]
Try this with both dates and see if you get the same results and the same delay when using Nov 5. What fields are indexed?
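One thing worth ruling out while testing: literals like '11/05/2009' are interpreted according to the session's language and DATEFORMAT settings, so the same string can mean 5th November or 11th May depending on the connection. The unambiguous ISO 8601 form removes any doubt about which day is being compared (predicates shown in isolation, as a sketch):

```sql
-- Unambiguous date literals, regardless of session DATEFORMAT:
AND ([view_answers].[StartTime] >= '2009-11-05T00:00:00')
AND ([view_answers].[StartTime] <= '2009-11-08T00:00:00')
```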