SQL Query Previous Row Optimisation

2023-01-11 00:32 Q&A author:

Here is my table structure

MyTable
-----------

ObjectID int (Identity),           -- Primary Key
FileName varchar(10),
CreatedDate datetime
...........
...........
...........

I need to get the time taken to create each record in a file, i.e. the time elapsed between the previous record in the same file and the current record of the same file.

For example, if the records are:

ObjectID    FileName    CreatedDate (just showing the time part here)
--------    --------    -----------
1           ABC         10:23
2           ABC         10:25
3           DEF         10:26
4           ABC         10:30
5           DEF         10:31
6           DEF         10:35

The required output is...

ObjectID    FileName    CreatedDate     PrevRowCreatedDate
--------    --------    -----------     ------------------
1           ABC         10:23           NULL
2           ABC         10:25           10:23
3           DEF         10:26           NULL
4           ABC         10:30           10:25
5           DEF         10:31           10:26
6           DEF         10:35           10:31
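
To try the queries in this thread against exactly these six rows, a minimal setup sketch follows. The date 2011-01-01 is an assumption made only for illustration, since the sample above shows just the time part; the remaining columns of the real table are omitted.

```sql
-- Hypothetical setup: only the three columns shown in the question.
CREATE TABLE MyTable
  (
     ObjectID    INT IDENTITY(1, 1) PRIMARY KEY,
     FileName    VARCHAR(10),
     CreatedDate DATETIME
  );

-- The date part (2011-01-01) is assumed; only the times come from the question.
-- Rows are inserted in ObjectID order so the identity values come out as 1..6.
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('ABC', '2011-01-01T10:23:00');
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('ABC', '2011-01-01T10:25:00');
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('DEF', '2011-01-01T10:26:00');
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('ABC', '2011-01-01T10:30:00');
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('DEF', '2011-01-01T10:31:00');
INSERT INTO MyTable (FileName, CreatedDate) VALUES ('DEF', '2011-01-01T10:35:00');
```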

So far I have this query, but it is taking much longer than expected. Is there a better way to do it?

    Select  A.ObjectID, 
        A.FileName,
        A.CreatedDate as CreatedDate, 
        B.PrevRowCreatedDate,
        datediff("SS", '1900-01-01 00:00:00', Coalesce((A.CreatedDate - B.PrevRowCreatedDate),0)) as secondsTaken
    from MyTable as A 
        Cross Apply (       
        (Select PrevRowCreatedDate = Max(CreatedDate) from MyTable as BB 
                        where   BB.FileName = A.FileName and 
                                BB.CreatedDate < A.CreatedDate
        )
        ) as B

Please let me know in case you need more information.

Thanks

SELECT t1.FileName, t1.CreatedDate, t2.CreatedDate as PrevCreatedDate
FROM 
   (SELECT FileName, CreatedDate,
          ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
   FROM MyTable) t1
LEFT JOIN
   (SELECT FileName, CreatedDate,
     ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
     FROM MyTable) t2
-- t2 is the row immediately before t1 within the same file
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)

Or it may be better to use 'WITH', because the two subqueries are identical:

WITH t(ObjectID, FileName, CreatedDate, OrderNo) AS
   (SELECT ObjectID, FileName, CreatedDate,
          ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
   FROM MyTable) 
SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate,
        DATEDIFF("SS", '1900-01-01 00:00:00', 
           COALESCE((t1.CreatedDate - t2.CreatedDate),0)) AS secondsTaken
FROM t t1 LEFT JOIN t t2 
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)
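
On SQL Server 2012 or later the self-join can be avoided altogether with the LAG window function. This is only a sketch under that version assumption: the question is tagged sql-server-2005, where LAG is not available, so there the ROW_NUMBER queries above remain the option. It uses the MyTable columns from the question.

```sql
-- Assumes SQL Server 2012+; LAG does not exist on SQL Server 2005.
SELECT ObjectID,
       FileName,
       CreatedDate,
       -- CreatedDate of the previous row in the same file, NULL for the first row
       LAG(CreatedDate) OVER (PARTITION BY FileName ORDER BY CreatedDate) AS PrevRowCreatedDate,
       -- 0 for the first row of each file, mirroring the COALESCE(..., 0) used above
       COALESCE(DATEDIFF(SECOND,
                         LAG(CreatedDate) OVER (PARTITION BY FileName ORDER BY CreatedDate),
                         CreatedDate), 0) AS secondsTaken
FROM   MyTable;
```

Note that DATEDIFF(SECOND, start, end) counts the second boundaries crossed between the two values, so for rows that are less than a second apart it can differ by one from the subtraction-based expression used in the other queries.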

I think Michael's answer should indeed prove more efficient. When evaluating efficiency though I just want to draw attention to an issue with the query costs (relative to the batch) shown in Management Studio.

I set up a test table with 23,174 rows and ran both the query in the question and Michael's. Looking at the "query cost (relative to the batch)" in the actual execution plans, the original query was shown as 1% of the cost and Michael's as 99%, so Michael's appears to be massively less efficient.


However, the actual stats tell a completely different story.

Cross Apply Approach

SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
Table 'MyTable'. Scan count 23175, logical reads 49335, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

ROW_NUMBER approach

SQL Server Execution Times: CPU time = 391 ms, elapsed time = 417 ms.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTable'. Scan count 2, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

In the ROW_NUMBER plan, the Merge Join on rownumber = rownumber + 1 has 23,174 rows going into both sides. This value is unique, and the actual number of rows out is 23,174 as well. However, SQL Server estimates that the join will produce 34,812,000 rows, and thus its estimated cost for the insert later in the plan is wildly inaccurate.
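
One way to see this kind of mismatch without opening the graphical plan is SET STATISTICS PROFILE, which returns the plan operators as an extra result set containing both the actual (Rows) and the estimated (EstimateRows) row counts. A minimal sketch, assuming the MyTable from the test script below already exists:

```sql
-- After the query result, one row per plan operator is returned;
-- compare the Rows and EstimateRows columns for the join operator.
SET STATISTICS PROFILE ON;

WITH t AS
   (SELECT ObjectID, FileName, CreatedDate,
           ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
    FROM MyTable)
SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate
FROM t t1 LEFT JOIN t t2
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1);

SET STATISTICS PROFILE OFF;
```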

Test Script

BEGIN TRAN

CREATE TABLE MyTable
  (
     [ObjectID]    [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
     [FileName]    [VARCHAR](50) NULL,
     [CreatedDate] [DATETIME] NULL
  )

GO

INSERT INTO MyTable
SELECT ISNULL(type, NEWID()),
       DATEADD(DAY, CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10000 AS INT), GETDATE())
FROM   master.dbo.spt_values,
       (SELECT TOP 10 1 AS X FROM  master.dbo.spt_values) V


DELETE FROM MyTable
WHERE  EXISTS(SELECT *
              FROM   MyTable m2
              WHERE  MyTable.CreatedDate = m2.CreatedDate
                     AND MyTable.FileName = m2.FileName
                     AND MyTable.ObjectID < m2.ObjectID)

CREATE UNIQUE NONCLUSTERED INDEX [IX_MyTable]
  ON MyTable ([FileName] ASC, [CreatedDate] ASC)

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT A.ObjectID,
       A.FileName,
       A.CreatedDate                                                                                AS CreatedDate,
       B.PrevRowCreatedDate,
       DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( A.CreatedDate - B.PrevRowCreatedDate ), 0)) AS secondsTaken
INTO   #A
FROM   MyTable AS A
       CROSS APPLY ((SELECT PrevRowCreatedDate = MAX(CreatedDate)
                     FROM   MyTable AS BB
                     WHERE  BB.FileName = A.FileName
                            AND BB.CreatedDate < A.CreatedDate)) AS B;

WITH t(ObjectID, FileName, CreatedDate, OrderNo)
     AS (SELECT ObjectID,
                FileName,
                CreatedDate,
                RANK() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
         FROM   MyTable)
SELECT t1.ObjectID,
       t1.FileName,
       t1.CreatedDate,
       t2.CreatedDate                                                                          AS PrevCreatedDate,
       DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( t1.CreatedDate - t2.CreatedDate ), 0)) AS secondsTaken
INTO   #B
FROM   t t1
       LEFT JOIN t t2
         ON ( t1.FileName = t2.FileName
              AND t1.OrderNo = t2.OrderNo + 1 )

/*Test the 2 queries give the same result*/
SELECT *
FROM   #A
EXCEPT
SELECT *
FROM   #B

SELECT *
FROM   #B
EXCEPT
SELECT *
FROM   #A

ROLLBACK

Tags: query-optimization sql sql-server-2005
