Find Date From Non Contiguous Ranges
I am looking to find an best way to find a date from date ranges that may or may not be contiguous (I am trying to avoid a cursor, or a heavy function if possible).
Lets say I have hotel guests that come and go (check in, check out). I want to find the date that a certain guest stayed their 45th night with us. The database we use records the data as so:
Create Table #GuestLog(
ClientId int,
StartDate DateTime,
EndDate DateTime)
Here is some data
Insert Into #GuestLog Values(1, '01/01/2010', '01/10/2010')
Insert Into #GuestLog Values(1, '01/16/2010', '01/29/2010')
Insert Into #GuestLog Values(1, '02/13/2010', '02/26/2010')
Insert Into #GuestLog Values(1, '04/05/2010', '06/01/2010')
Insert Into #GuestLog Values(1, '07/01/2010', '07/21/2010')
So far I can only think of solutions that involve functions with temp tables and crazy stuff like that, I feel like I'm over thinking it.
Thanks ahead of time.
EDIT: slight mod of @Andriy M's solution.
DECLARE @ClientID int, @NightNo int;
SET @ClientID = 1;
SE开发者_开发知识库T @NightNo = 45;
SELECT *
FROM ( SELECT gl.ClientId
, Date = gl.StartDate + v.number - 1
, rownum = ROW_NUMBER() OVER ( PARTITION BY gl.ClientId ORDER BY gl.StartDate, v.Number)
FROM #GuestLog gl
INNER JOIN master..spt_values v ON v.type = 'P'
AND v.number BETWEEN 1 AND gl.EndDate - gl.StartDate + 1) as s //--added "+ 1"
WHERE ClientId = @ClientId
AND rownum = @NightNo
Following Jeremy Pridemore's good example, I've parameterised my solution too (why not indeed?).
One note: because you've said '45th night', I undestand that means the date preceding the night should be taken. If I am wrong there, then just remove the - 1
part where Date
is calculated.
DECLARE @ClientID int, @NightNo int;
SET @ClientID = 1;
SET @NightNo = 45;
SELECT *
FROM (
SELECT
gl.ClientId,
Date = gl.StartDate + v.number - 1,
rownum = ROW_NUMBER() OVER (
PARTITION BY gl.ClientId
ORDER BY gl.StartDate, v.Number
)
FROM #GuestLog gl
INNER JOIN master..spt_values v ON v.type = 'P'
AND v.number BETWEEN 1 AND gl.EndDate - gl.StartDate
) s
WHERE ClientId = @ClientId
AND rownum = @NightNo
I cant't test my code where I'm sitting, but the following should work:
First, generate a tally table (make it permanent if you can).
Then use it to flatten the date range like this:
SELECT DATEADD(d,n.number,'01/01/2000') AS StayedDate
FROM numbers n
INNER JOIN #GuestLog g ON DATEADD(d,n.number,'01/01/2000') BETWEEN g.StartDate AND g.EndDate)
ORDER BY n.number
then add a CTE with ROW_NUMBER() to access the 45th row.
If you have those query types often, make an additional date table (just like the number table, but with dates), to get rid of the ugly DATEADDs.
Try this(I hate using inline queries for columns but couldb't think of any other route.):
WITH logd
AS (SELECT a.*,
(SELECT SUM(Datediff(d, startdate, enddate))
FROM #guestlog b
WHERE b.clientid = a.clientid
AND b.startdate <= a.startdate) dayssofar
FROM #guestlog a)
SELECT a.*,
Dateadd(d, ( 45 - dayssofar ), enddate)
FROM (SELECT b.*,
Row_number() OVER(PARTITION BY clientid ORDER BY dayssofar)rn
FROM logd b
WHERE dayssofar > 44) a
WHERE rn = 1
I tested my solution in SQL Server 2008 R2 on a database with 90 compatibility (SQL Server 2005) so I believe this will do what you want it to:
-- PLEASE NOTE: MAXRECURSION at the bottom needs to have a number that is higher than the
-- number of stored stays that any guest this will run on will have. Otherwise you'll need
-- to find a way to do this without recursion.
-- Parameterized because...why not? :)
DECLARE @CustomerID INT = 1
, @NthStayDay INT = 45;
-- This does nothing but get the rows out of GuestLog that we care about. From my experience
-- it's a good idea to do a simple data grab from a physical table or indexed view using
-- a seek, then play with that smaller subset of data in other CTE's. Though I'm sure that
-- those with more performance knowledge could give better answers. RowNumber is added for
-- recursion in the next CTE.
WITH OrderedStays(RowNumber, StartDate, EndDate) AS
(
SELECT
ROW_NUMBER() OVER(ORDER BY StartDate) AS RowNumber
, StartDate
, EndDate
FROM @GuestLog GuestLog
WHERE GuestLog.ClientId = @CustomerID
)
-- This is a recusive CTE, but I don't imagine it will preform to badly because there is no IO
-- at this point, simply processing the previous CTE. You'll have to be the judge of that.
-- The purpose of this CTE is to be able to limit down to the start date that we care about.
, StayRanges(RowNumber, StartDate, EndDate, FirstDayCount, LastDayCount) AS
(
-- This is our anchor row. It is the first date range at which the guest stayed with you.
-- The DATEDIFF returns 9 with dates of 20100101 - 20100110, but since I don't think the
-- 0th day stayed makes sense, I'm making it return 10 in that case since we're starting
-- at 1.
SELECT
RowNumber
, StartDate
, EndDate
, 1 AS FirstDayCount
, DATEDIFF(DAY, StartDate, EndDate) + 1 AS LastDayCount
FROM OrderedStays
WHERE RowNumber = 1
UNION ALL
-- This is the recursion. This joins the first CTE on what we have where the first CTE's
-- RowNumber is 1 more than whatever is in our StayRanges CTE. The column logic is the
-- same as above, but now we need to add in the LastDayCount from our previous iteration.
SELECT
OrderedStays.RowNumber
, OrderedStays.StartDate
, OrderedStays.EndDate
, StayRanges.LastDayCount + 1 AS FirstDayCount
, DATEDIFF(DAY, OrderedStays.StartDate, OrderedStays.EndDate) + StayRanges.LastDayCount + 1 AS LastDayCount
FROM OrderedStays
INNER JOIN StayRanges
ON (StayRanges.RowNumber + 1) = OrderedStays.RowNumber
)
-- Now that we have our ranges, we can select the range that has the day we want in it with a
-- simple between. Once that's done, take out the FirstDayCount from the day we care about so
-- that you're left with the difference from the StartDate and the date we want, and add that
-- to the StartDate. Done!
SELECT
DATEADD(DAY, @NthStayDay - FirstDayCount, StartDate) AS DateOfNthStayDate
FROM StayRanges
WHERE @NthStayDay BETWEEN FirstDayCount AND LastDayCount
OPTION(MAXRECURSION 5000)
精彩评论