SQL Query Previous Row Optimisation
Here is my table structure
MyTable
-----------
ObjectID int (Identity), -- Primary Key
FileName varchar(10),
CreatedDate datetime
...........
...........
...........
I need to get the time taken to create each record in a file, i.e. the time elapsed between the previous record in the same file and the current record of the same file.
For example, if the records are:
ObjectID FileName CreatedDate (just showing the time part here)
-------- -------- -----------
1 ABC 10:23
2 ABC 10:25
3 DEF 10:26
4 ABC 10:30
5 DEF 10:31
6 DEF 10:35
The required output is...
ObjectID FileName CreatedDate PrevRowCreatedDate
-------- -------- ----------- ---------------
1 ABC 10:23 NULL
2 ABC 10:25 10:23
3 DEF 10:26 NULL
4 ABC 10:30 10:25
5 DEF 10:31 10:26
6 DEF 10:35 10:31
So far I have this query, but it takes much longer than expected. Is there a better way to do it?
Select A.ObjectID,
A.FileName,
A.CreatedDate as CreatedDate,
B.PrevRowCreatedDate,
datediff("SS", '1900-01-01 00:00:00', Coalesce((A.CreatedDate - B.PrevRowCreatedDate),0)) as secondsTaken
from MyTable as A
Cross Apply (
(Select PrevRowCreatedDate = Max(CreatedDate) from MyTable as BB
where BB.FileName = A.FileName and
BB.CreatedDate < A.CreatedDate
)
) as B
Please let me know if you need more information.
Thanks
SELECT t1.FileName, t1.CreatedDate, t2.CreatedDate as PrevCreatedDate
FROM
(SELECT FileName, CreatedDate,
ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
FROM MyTable) t1
LEFT JOIN
(SELECT FileName, CreatedDate,
ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
FROM MyTable) t2
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)
Or it may be better to use 'WITH' (a common table expression), because the two subqueries are identical:
WITH t(ObjectID, FileName, CreatedDate, OrderNo) AS
(SELECT ObjectID, FileName, CreatedDate,
ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
FROM MyTable)
SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate,
DATEDIFF("SS", '1900-01-01 00:00:00',
COALESCE((t1.CreatedDate - t2.CreatedDate),0)) AS secondsTaken
FROM t t1 LEFT JOIN t t2
ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)
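As a further option, and only as a sketch: if the server is SQL Server 2012 or later (an assumption, since the question does not state the version), the LAG window function can fetch the previous row's value without a self-join. The query below uses only the table and column names from the question. Note that, unlike the strict "<" comparison in the original query, LAG treats two rows with the same FileName and CreatedDate as neighbours, so results may differ if such duplicates exist.

```sql
-- Sketch only: assumes SQL Server 2012+ where LAG is available.
-- Table and column names are taken from the question.
SELECT ObjectID,
       FileName,
       CreatedDate,
       -- Previous CreatedDate within the same file; NULL for the first row of each file
       LAG(CreatedDate) OVER (PARTITION BY FileName ORDER BY CreatedDate) AS PrevRowCreatedDate,
       -- Seconds since the previous row of the same file; 0 when there is no previous row
       COALESCE(DATEDIFF(SECOND,
                         LAG(CreatedDate) OVER (PARTITION BY FileName ORDER BY CreatedDate),
                         CreatedDate), 0) AS secondsTaken
FROM MyTable
ORDER BY ObjectID;
```

With an index on (FileName, CreatedDate) this should need only a single ordered pass over the table, but it is worth comparing the actual IO and time statistics against the other approaches rather than assuming it is faster.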
I think Michael's answer should indeed prove more efficient. When evaluating efficiency, though, I want to draw attention to an issue with the query costs (relative to the batch) shown in Management Studio.
I set up a test table with 23,174 rows and ran both the query in the question and Michael's. Looking at the "query cost (relative to the batch)" in the actual execution plan, the original query was 1% and Michael's 99% of the cost, so Michael's appears to be massively inefficient.
However, the actual stats tell a completely different story:
Cross Apply Approach
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
Table 'MyTable'. Scan count 23175, logical reads 49335, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
ROW_NUMBER approach
SQL Server Execution Times: CPU time = 391 ms, elapsed time = 417 ms.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTable'. Scan count 2, logical reads 148, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
In the ROW_NUMBER plan, the Merge Join on rownumber = rownumber + 1 has 23,174 rows going into both sides. This value is unique, and the actual number of rows out is 23,174 as well. However, SQL Server estimates that the join will produce 34,812,000 rows, and thus its estimated cost for the insert later in the plan is wildly inaccurate.
Test Script
BEGIN TRAN
CREATE TABLE MyTable
(
[ObjectID] [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
[FileName] [VARCHAR](50) NULL,
[CreatedDate] [DATETIME] NULL
)
GO
INSERT INTO MyTable
SELECT ISNULL(type, NEWID()),
DATEADD(DAY, CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10000 AS INT), GETDATE())
FROM master.dbo.spt_values,
(SELECT TOP 10 1 AS X FROM master.dbo.spt_values) V
DELETE FROM MyTable
WHERE EXISTS(SELECT *
FROM MyTable m2
WHERE MyTable.CreatedDate = m2.CreatedDate
AND MyTable.FileName = m2.FileName
AND MyTable.ObjectID < m2.ObjectID)
CREATE UNIQUE NONCLUSTERED INDEX [IX_MyTable]
ON MyTable ([FileName] ASC, [CreatedDate] ASC)
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT A.ObjectID,
A.FileName,
A.CreatedDate AS CreatedDate,
B.PrevRowCreatedDate,
DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( A.CreatedDate - B.PrevRowCreatedDate ), 0)) AS secondsTaken
INTO #A
FROM MyTable AS A
CROSS APPLY ((SELECT PrevRowCreatedDate = MAX(CreatedDate)
FROM MyTable AS BB
WHERE BB.FileName = A.FileName
AND BB.CreatedDate < A.CreatedDate)) AS B;
WITH t(ObjectID, FileName, CreatedDate, OrderNo)
AS (SELECT ObjectID,
FileName,
CreatedDate,
RANK() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo
FROM MyTable)
SELECT t1.ObjectID,
t1.FileName,
t1.CreatedDate,
t2.CreatedDate AS PrevCreatedDate,
DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( t1.CreatedDate - t2.CreatedDate ), 0)) AS secondsTaken
INTO #B
FROM t t1
LEFT JOIN t t2
ON ( t1.FileName = t2.FileName
AND t1.OrderNo = t2.OrderNo + 1 )
/*Test the 2 queries give the same result*/
SELECT *
FROM #A
EXCEPT
SELECT *
FROM #B
SELECT *
FROM #B
EXCEPT
SELECT *
FROM #A
ROLLBACK