SQL Count distinct times with a 30 minute difference
I'm trying to find a SQL query that will count the number of distinct start times that are at least 30 minutes different.
I have a number of employees that are paid a credit when they start work on at least three distinct times in a week, where the start time is at least 30 minutes different from the other start times. For example:
select count(distinct (CONVERT(VARCHAR(10), starttime, 108))), employeecode
from schedule
where CONVERT(VARCHAR(10), starttime, 108) >=
(select min(CONVERT(VARCHA开发者_StackOverflowR(10), dateadd (mi, 30, s2.starttime), 108)) from schedule s2)
group by starttime, employeecode
I am hoping to get a result with the employee code and the number of different and distinct start times. eg. Employeecode = 9999, Different Start times = 4 I have been bumbling through this and am yet to get something working...
Can anyone suggest where I am going wrong or a suitable solution that might help me? Thanks in advance for your help :)
While waiting for clarification on exact requirements I thought I'd suggest another approach. I'll put pros and cons with it...
If the start times are usually at or around a certain time (you're examples were always on the half hour or on the hour) then you can just split all the start times into what "band" they are in and then count the number of different bands.
eg 00:00-00:30 = Band 1 00:30-01:00 = Band 2 ... 07:00-07:30 = Band 15 ... 23:30-00:00 = Band 48
To get the bands you'd just need a simple (though quite lengthy) case statement.
The main problem with this approach is that it falls down when your times are next to the threshold. eg 07:29 and 07:31 would be in two different bands but are in fact only 2 minutes apart. This can be mitigated slightly if you are starting around the same time by making your bands start and finish at 15 and 45 mintues past each hour. Then if the start times are all in the middle of the bands then you will get it mostly right...
In my head though the problem isn't really one that suited for SQL so if you can do it in a different language that might be better...
You could probably do it in SQL with some tricky joins but I'm not capable of writing reliable SQL for it... Algorithmically though you want to do the following.
1) take earliest start time in the day and call that your first start time. 2) take the next earliest time that is at least 30 minutes later than the time from your previous step. 3) Repeat step 2 until you run out of times. 4) Count the times.
The problem with this from a SQL point of view is that it is trying to create data based off of a previous row which would mean doing stuff with cursors to loop through your times and storing things in variables.
[Update: Based on the poster's clarification of the problem in a comment on this answer, the problem I was solving with this answer is clearly not the problem the poster is trying to solve. I'm leaving the answer so as to show the solution to the other problem, and so as not to remove the comments which clarify the problem statement]
Break the problem into two parts: identifying the "unique" (within 30 minutes) starts and then counting them. The first part is the one I think you're having trouble with. Here is an approach:
SELECT employeecode, starttime FROM schedule S1
WHERE NOT EXISTS (SELECT * FROM schedule S2
WHERE S2.employeecode = S1.employeecode AND
S2.starttime > DATEADD(mi, -29, S1.starttime)
A few notes:
I copied the date math logic from your original query rather than looking up the syntax.
I assume starttime is DATETIME.
I used 29 minutes so that they'd get the bonus if the starttimes were 30 or more minutes apart (as stated in your problem statement). Actually, you should do this by doing the date math using seconds and subtracting (29 * 60) + 59. My version is slightly more generous to the employees than your problem statement specifies.
You can encapsulate this query in an view or inner query and do something like (assuming it's a view):
SELECT employeecode, count() FROM unique_starts_view WHERE starttime BETWEEN (beginning of period) AND (end of period) GROUP BY employeecode HAVING COUNT() >= 3
The NOT EXISTS technique can be slow so it's best to limit that query to the period you're interested in.
I'm assuming that your database product is SQL Server based on your OP but you did not mention the version. If you are using SQL Server 2005 and later you could try something like:
With StartTimes As
(
Select StartDateTime
, Row_Number() Over( Order By StartDateTime ) As Seq
, DatePart(hh, StartDateTime) * 60 + DatePart(mi, StartDateTime) As Minutes
From Schedule
)
Select *
From StartTimes As S1
Where Exists(
Select 1
From StartTimes As S2
Where S1.Seq <> 1
And Abs(S2.Minutes - S1.Minutes) >= 30
)
Using the time bands (not to be confused with Time Bandits) that Chris mentioned:
CREATE TABLE Start_Periods
(
begin_time TIME NOT NULL,
end_time TIME NOT NULL,
time_period TINYINT NOT NULL
CONSTRAINT PK_Start_Periods PRIMARY KEY CLUSTERED (begin_time),
CONSTRAINT CK_Start_Periods_begin_before_end CHECK (begin_time < end_time OR end_time = '00:00:00.000')
)
INSERT INTO Start_Periods (begin_time, end_time, time_period)
SELECT '00:00:00.000', '00:15:00.000', 1 UNION ALL
SELECT '00:15:00.000', '00:45:00.000', 2 UNION ALL
SELECT '00:45:00.000', '01:15:00.000', 3 UNION ALL
SELECT '01:15:00.000', '01:45:00.000', 4 UNION ALL
SELECT '01:45:00.000', '02:15:00.000', 5 UNION ALL
SELECT '02:15:00.000', '02:45:00.000', 6 UNION ALL
SELECT '02:45:00.000', '03:15:00.000', 7 UNION ALL
SELECT '03:15:00.000', '03:45:00.000', 8 UNION ALL
--...
SELECT '23:15:00.000', '23:45:00.000', 48 UNION ALL
SELECT '23:45:00.000', '00:00:00.000', 1
Your query then becomes:
SELECT
SCH.employee_code,
COUNT(DISTINCT SP.time_period) AS different_time_starts
FROM
Schedule SCH
INNER JOIN Start_Periods SP ON
SP.begin_time <= SCH.start_time AND
SP.end_time > SCH.start_time
GROUP BY
SCH.employee_code
Just to give you an idea:
- Build the SQL for only one week
- Do a
SELF JOIN
ofschedule
while sutracting the start times each from another - Now count all
DISTINCT
differences greater 30 Minutes for each customer -> gives you the number of start times
This should give you the desired result.
精彩评论