SQL to find time elapsed from multiple overlapping intervals
Not using MSSQL or DB2 or Oracle. No CTE. No OVERLAP predicate. No INTERVAL data type. The situation: on a vehicle to be repaired work can not start until all parts ordered for the job have been received. Parts may be ordered multiple times 开发者_如何转开发prior to the start of repair. We need to extract the time for which the vehicle was on "parts hold"
So for a vehicle identified as id = 1 parts were ordered (d1) and received (d2) on 4 different occasions
ID d1 d2
1 8/1 8/8
1 8/2 8/6
1 8/12 8/14
1 8/3 8/10
8/1 8/8
d1 d2
|-------------------------------|
8/2 8/6 8/12 8/14
d1 d2 d1 d2
|---------------| |----------|
8/3 8/10
d1 d2
|---------------------|
8/1 8/14
|---------------------------------------------------------| = 13 days
8/10 8/12
|--------------------------------------| + |----------| = parts hold = 11 days
As seen from above, the wait time to start work (assuming 8/1 as the date from which the vehicle was available for work) was 13 days. The actual time spent waiting for parts was 11 days, which is the number we need to derive from the data. The actual datetime data will be timestamps from which we will extract hours, we used dates in this sample data for simplicity of presentation. We are struggling to generate a set (not psm, not udf, not cursor) based solution. TIA
I couldn't get @Alex W's queries to work. It is not standard SQL, so it required a lot of rewrite to be compatible with SQL Server (which I can test). But it did give me some inspiration, which I have expanded upon.
Find all start-points of every period of uninterrupted waiting:
SELECT DISTINCT
t1.ID,
t1.d1 AS date,
-DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) AS n
FROM Orders t1
LEFT JOIN Orders t2 -- Join for any events occurring while this
ON t2.ID = t1.ID -- is starting. If this is a start point,
AND t2.d1 <> t1.d1 -- it won't match anything, which is what
AND t1.d1 BETWEEN t2.d1 AND t2.d2 -- we want.
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
And the equivalent for end-points:
SELECT DISTINCT
t1.ID,
t1.d2 AS date,
DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d2) AS n
FROM Orders t1
LEFT JOIN Orders t2
ON t2.ID = t1.ID
AND t2.d2 <> t1.d2
AND t1.d2 BETWEEN t2.d1 AND t2.d2
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
n
is the number of days since some common point in time. Start-points have a negative value, and end-points have a positive value. This is so that we can just add them up to get the number of days in between.
span = end - start
span = end + (-start)
span1 + span2 = end1 + (-start1) + end2 + (-start2)
Finally, we just need to add things up:
SELECT ID, SUM(n) AS hold_days
FROM (
SELECT DISTINCT
t1.id,
t1.d1 AS date,
-DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) AS n
FROM Orders t1
LEFT JOIN Orders t2
ON t2.ID = t1.ID
AND t2.d1 <> t1.d1
AND t1.d1 BETWEEN t2.d1 AND t2.d2
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
UNION ALL
SELECT DISTINCT
t1.id,
t1.d2 AS date,
DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d2) AS n
FROM Orders t1
LEFT JOIN Orders t2
ON t2.ID = t1.ID
AND t2.d2 <> t1.d2
AND t1.d2 BETWEEN t2.d1 AND t2.d2
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0
ORDER BY ID, date
) s
GROUP BY ID;
Input table (Orders):
ID d1 d2
1 2011-08-01 2011-08-08
1 2011-08-02 2011-08-06
1 2011-08-03 2011-08-10
1 2011-08-12 2011-08-14
2 2011-08-01 2011-08-03
2 2011-08-02 2011-08-06
2 2011-08-05 2011-08-09
Output:
ID hold_days
1 11
2 8
Alternatively, you can do this with a stored procedure.
CREATE PROCEDURE CalculateHoldTimes
@ID int = 0
AS
BEGIN
DECLARE Events CURSOR FOR
SELECT *
FROM (
SELECT d1 AS date, 1 AS diff
FROM Orders
WHERE ID = @ID
UNION ALL
SELECT d2 AS date, -1 AS diff
FROM Orders
WHERE ID = @ID
) s
ORDER BY date;
DECLARE @Events_date date,
@Events_diff int,
@Period_start date,
@Period_accum int,
@Total_start date,
@Total_count int;
OPEN Events;
FETCH NEXT FROM Events
INTO @Events_date, @Events_diff;
SET @Period_start = @Events_date;
SET @Period_accum = 0;
SET @Total_start = @Events_date;
SET @Total_count = 0;
WHILE @@FETCH_STATUS = 0
BEGIN
SET @Period_accum = @Period_accum + @Events_diff;
IF @Period_accum = 1 AND @Events_diff = 1
-- Start of period
SET @Period_start = @Events_date;
ELSE IF @Period_accum = 0 AND @Events_diff = -1
-- End of period
SET @Total_count = @Total_count +
DATEDIFF(day, @Period_start, @Events_date);
FETCH NEXT FROM Events
INTO @Events_date, @Events_diff;
END;
SELECT
@Total_start AS d1,
@Events_date AS d2,
@Total_count AS hold_time;
END;
Call it with:
EXEC CalculateHoldTimes 1;
This SQL statement seems to get what you want (t is the table name of the sampe table):
SELECT
d.id,
d.duration,
d.duration -
IFNULL(
( SELECT Sum( timestampdiff( SQL_TSI_DAY,
no_hold.d2,
( SELECT min(d1) FROM t t4
WHERE t4.id = no_hold.id and t4.d1 > no_hold.d2 )))
FROM ( SELECT DISTINCT id, d2 FROM t t1
WHERE ( SELECT sum( IIF( t1.d2 between t2.d1 and t2.d2, 1, 0 ) )
FROM t t2 WHERE t2.id = t1.id and t2.d2 <> t1.d2 ) = 0
And d2 <> ( select max( d2 ) from t t3 where t3.id = t1.id )) no_hold
WHERE no_hold.id = d.id ),
0 ) "parts hold"
FROM
( SELECT id, timestampdiff( SQL_TSI_DAY, min( d1 ), max( d2 ) ) duration
FROM t GROUP BY id ) d
The outer query gets the duration of the repair work. The complex subquery calculates the total number of days not waiting for parts. This is done by locating the start dates where the vehicle is not waiting for parts, and then count the number of days until it begins to wait for parts again:
// 1) The query for finding the starting dates when the vehicle is not waiting for parts,
// i.e. finding all d2 that is not within any date range where the vehicle is waiting for part.
// The DISTINCT is needed to removed duplicate starting "no hold" period.
SELECT DISTINCT id, d2
FROM t t1
WHERE ( SELECT sum( IIF( t1.d2 between t2.d1 and t2.d2, 1, 0 ) ) from t t2
WHERE t2.id = t1.id and t2.d2 <> t1.d2 ) = 0 AND
d2 <> ( SELECT max( d2 ) FROM t t3 WHERE t3.id = t1.id ) )
// 2) The days where it vehicle is not waiting for part is the date from the above query till the vehicle is // waiting for part again
timestampdiff( SQL_TSI_DAY, no_hold.d2, ( SELECT min(d1) FROM t t4 WHERE t4.id = no_hold.id and t4.d1 > no_hold.d2 ) )
Combining the two above and aggregating all such periods gives the number of days that the vehicle is not waiting for parts. The final query adds an extra condition to calculate result for each id from the outer query.
This probably is not terribly efficient on very large table with many ids. It should fine if the id is limited to one or just a few.
USE [DnnMasterShoraSystem]
GO
/****** Object: StoredProcedure [dbo].[CalculateHoldTimes] Script Date: 12/8/2014 1:36:12 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[CalculateHoldTimes]
@PID int
AS
BEGIN
CREATE TABLE #tblTemp(
[ID] [int] NOT NULL,
[PID] [int] NOT NULL,
[BID] [int] NOT NULL,
[Active] [bit] NULL,
[WorkStartDate] [nvarchar](10) NULL,
[WorkEndDate] [nvarchar](10) NULL,
[jobStateID] [int] NULL,
[RegisterType] [int] NULL,
[RegisterState] [int] NULL,
[En_time] [datetime] NULL,
[Fa_time] [nvarchar](40) NULL,
[Status] [nvarchar](100) NULL,
[PortalId] [int] NULL,
[ModuleId] [int] NULL,
[UserId] [int] NULL,
[BrName] [nvarchar](150) NULL,
[BrCode] [nvarchar](20) NULL,
[WorkEndDate_New] [nvarchar](10) NULL
) ON [PRIMARY]
insert into #tblTemp
select * from [dbo].[Shora.Personel_Branch_Copy]
where WorkStartDate is not null
--and [dbo].[ShamsiToMiladi](WorkStartDate) <GETDATE()
--and [dbo].[ShamsiToMiladi](WorkEndDate) <GETDATE()
and PID=@PID
--and [dbo].[ShamsiToMiladi](WorkEndDate)<[dbo].[ShamsiToMiladi](@NewDate)
order by WorkStartDate
DECLARE Events CURSOR FOR
SELECT [dbo].[ShamsiToMiladi](WorkStartDate) AS StartDate,[dbo].[ShamsiToMiladi](WorkEndDate) AS EndDate
FROM #tblTemp
ORDER BY StartDate;
--drop table #tblTemp
DECLARE @SDate date,
@EDate date,
@Period_Start date,
@Period_End date,
@Total int,
@OldSDate date,
@OldEDate date
OPEN Events;
FETCH NEXT FROM Events
INTO @SDate, @EDate;
set @Total=0
SET @Period_Start =@SDate
set @Period_End=@EDate
WHILE @@FETCH_STATUS = 0
BEGIN
if @OldSDate>@Period_End
begin
set @Period_Start=@SDate
if @Period_End>=@Period_Start
set @Total+=DATEDIFF(DAY,@Period_Start,@Period_End)
end
else if @SDate<@Period_End
begin
set @Period_Start=@Period_Start
set @Total=DATEDIFF(DAY,@Period_Start,@Period_End)
end
set @OldSDate=@SDate
set @OldEDate=@EDate
FETCH NEXT FROM Events
INTO @SDate, @EDate;
if @Period_End<@EDate
set @Period_End=@EDate
END;
INSERT INTO [dbo].[PersonelDays]
(PID
,[Total_Start]
,[Total_End]
,[Total_count])
VALUES
(@PID,
@Period_Start,
@Period_End,
@Total
)
drop table #tblTemp
CLOSE Events
DEALLOCATE Events
END;
精彩评论