Why Does Select Top X Kill Performance

2023-02-02 22:29 问答作者：

I have the T-SQL Query below (generated by linq-to-sql, so it looks a bit weird.) When I remove the top 10, it runs in under 1 second. When I leave the top 10 in, it takes 68 seconds.

Without the top 10, the total number of rows is only 19.

Why does the top 10 destroy performance so much in this situation?

SELECT TOP (10) [t3].[value] AS [StartDate], [t3].[AH_TimeStamp4] AS [EndDate], [t3].[value2] AS [RunTime], [t3].[AH_Idnr] AS [RunNumber], [t3].[AH_Status] AS [StatusCode], [t3].[STATUS_DESC] AS [StatusDesc], [t3].[STATUS_TYPE] AS [StatusType]
FROM (
    SELECT COALESCE([t1].[AH_TimeStamp2],[t1].[AH_TimeStamp1]) AS [value], [t1].[AH_TimeStamp4], CONVERT(BigInt,(((CONVERT(BigInt,DATEDIFF(DAY, [t1].[AH_TimeStamp1], [t1].[AH_TimeStamp4]))) * 86400000) + DATEDIFF(MILLISECOND, DATEADD(DAY, DATEDIFF(DAY, [t1].[AH_TimeStamp1], [t1].[AH_TimeStamp4]), [t1].[AH_TimeStamp1]), [t1].[AH_TimeStamp4])) * 10000) AS [value2], [t1].[AH_Idnr], [t1].[AH_Status], [t2].[STATUS_DESC], [t2].[STATUS_TYPE], [t0].[OH_Name]
    FROM [dbo].[OH] AS [t0]
    INNER JOIN [dbo].[AH] AS [t1] ON ([t0].[OH_Idnr]) = [t1].[AH_OH_Idnr]
    INNER JOIN [dbo].[CHK_JOB_STATUS_CODE] AS [t2] ON [t1].[AH_Status] = ([t2].[STATUS_CODE])
    ) AS [t3]
WHERE [t3].[OH_Name] = @p0
ORDER BY [t3].[AH_Idnr] DESC

Edit -> as requested, here's the linq-generated query without the Take(10)

SELECT [t3].[value] AS [StartDate], [t3].[AH_TimeStamp4] AS [EndDate], [t3].[value2] AS [RunTime], [t3].[AH_Idnr] AS [RunNumber], [t3].[AH_Status] AS [StatusCode], [t3].[STATUS_DESC] AS [StatusDesc], [t3].[STATUS_TYPE] AS [StatusType]
FROM (
    SELECT COALESCE([t1].[AH_TimeStamp2],[t1].[AH_TimeStamp1]) AS [value], [t1].[AH_TimeStamp4], CONVERT(BigInt,(((CONVERT(BigInt,DATEDIFF(DAY, [t1].[AH_TimeStamp1], [t1].[AH_TimeStamp4]))) * 86400000) + DATEDIFF(MILLISECOND, DATEADD(DAY, DATEDIFF(DAY, [t1].[AH_TimeStamp1], [t1].[AH_TimeStamp4]), [t1].[AH_TimeStamp1]), [t1].[AH_TimeStamp4])) * 10000) AS [value2], [t1].[AH_Idnr], [t1].[AH_Status], [t2].[STATUS_DESC], [t2].[STATUS_TYPE], [t0].[OH_Name]
    FROM [dbo].[OH] AS [t0]
    INNER JOIN [dbo].[AH] AS [t1] ON ([t0].[OH_Idnr]) = [t1].[AH_OH_Idnr]
    INNER JOIN [dbo].[CHK_JOB_STATUS_CODE] AS [t2] ON [t1].[AH_Status] = ([t2].[STATUS_CODE])
    ) AS [t3]
WHERE [t3].[OH_Name] = @p0
ORDER BY [t3].[AH_Idnr] DESC

Edit 2 -> here's the LINQ Query - nothing special, just 3 joins

List<UC4Status> statusList = uc4DB.OHs.Where(o => o.OH_Name == jobName).Join(uc4DB.AHs, oh => oh.OH_Idnr, ah => ah.AH_OH_Idnr, (oh, ah) => ah)
                 开发者_运维技巧       .Join(uc4DB.CHK_JOB_STATUS_CODEs, ah => ah.AH_Status, job => job.STATUS_CODE, (a, s) =>
                            new UC4Status
                            {
                                StartDate = a.AH_TimeStamp2 ?? a.AH_TimeStamp1,
                                EndDate = a.AH_TimeStamp4,
                                RunTime = a.AH_TimeStamp4 - a.AH_TimeStamp1,
                                StatusType = s.STATUS_TYPE,
                                StatusDesc = s.STATUS_DESC,
                                StatusCode = a.AH_Status,
                                RunNumber = a.AH_Idnr
                            }).OrderByDescending(r => r.RunNumber).Take(maxResults).ToList();

More important than the number of rows returned are the number of rows in each table joined. Without access to the query plan it is hard to say what the issue is. But, the issue could likely be due to outdated or missing column statistics combined with a lack of good indexes causing the query optimizer make a very bad decision when top is used. For example, if the optimizer incorrectly makes the assumption that the top statement will only operate on a small number of rows in each joined table, it could select to use inner loop joins with table scans on each joined table -- which could be a horribly slow operation on a large number of records (but can work very efficiently when working with only few records). In the case without top, the optimizer could correctly assume that it will have to scan a large volume of records in the joined tables and use an approach that better suits such an operation, such as a hash match join from a table scan.

You can try updating statistics and adding other indexes to see if that helps.

What's the number of rows without the top 10? Are there any indexes implemented? I would guess it's due to the subquery joins of t0, t1, and t2... Can you try something like,


select top 10 * from (
  select t3.value....
  from ( 
    select coalesce... )
  where ... order by ... 
)

继续阅读：sql sql-server-2005 tsql

Why Does Select Top X Kill Performance

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？