开发者

Consecutive streak of dates

Hopefully this isn't a dupe of another question, but I couldn't see it anywhere else - also this is a simplified version of another question I asked, hopefully to get me started on working out how to approach it.

I am looking to work out consecutive ranges of payments where there has been at least one payment in each month.

I have the following sample data

CREATE TABLE #data
(
Contact_reference NVARCHAR(55)
,Date_payment DATETIME
,Payment_value MONEY
)

INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2003-06-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2004-06-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2004-12-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-04-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-05-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-06-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-07-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-08-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-09-08',12.82)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-10-10',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-11-10',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2005-12-10',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2006-01-10',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2006-02-10',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2006-02-28',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2006-04-12',12.8205)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2006-05-10',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-06-11',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-07-10',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-08-09',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-09-10',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-10-09',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-11-09',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2007-12-10',19.2308)
INSERT INTO #data VALUES ('18EC3CD2-3065-4FF4-BE40-000004228590','2008-01-10',19.2308)

And what I would like to be able to do is to work out for each contact the ranges over which they gave consecutively (defined as giving at least once in every calendar month), the number of consecutive payments, the total value per range (and ideally if possible the gap between the current range and the end of the most recent one).

For the test data above my output would look like this:

CREATE TABLE #results
(
contact_reference NVARCHAR(55)
,Range_start DATETIME
,Range_end DATETIME
,Payments INT
,Value MONEY
,months_until_next_payment INT --works out the gap between the range_end date for a group and the range_start date for the next group
)
INSERT INTO #results VALUES('18EC3CD2-3065-4FF4-BE40-000004228590','2003-06-08','2003-06-08',1,12.82,12)
INSERT INTO #results VALUES('18EC3CD2-3065-4FF4-BE40-000004228590','2004-06-08','2004-06-08',1,12.82,6)
INSERT INTO #results VALUES('18EC3CD2-3065-4FF4-BE40-000004228590','2004-12-08','2004-12-08',1,12.82,4)
INSERT INTO #results VALUES('18EC3CD2-3065-4FF4-BE40-000004228590','2005-04-08','2006-02-28',12,153.843,2)
INSERT INTO #results VALUES('18EC3CD2-3065-4FF4-BE40-000004228590','2006-04-12','2008-06-06',27,416.6673,NULL)

I've looked for answers using islands, or iterati开发者_如何学运维ons but I quite frankly don't even know where to begin applying them to my question, so any help massively appreciated :)


Edit: I've added in the months_until_next_payment column. This would be more efficiently done in the application rather than with a self join however as SQL Server does not have any particularly satisfactory way of referencing next and previous rows.

;WITH base AS ( 
SELECT    Contact_reference  ,
          Payment_value,
          DATEPART(YEAR, Date_payment)*12 + DATEPART(MONTH, Date_payment) - 
               DENSE_RANK() OVER 
                   (PARTITION BY Contact_reference 
                    ORDER BY DATEPART(YEAR, Date_payment)*12 + DATEPART(MONTH, Date_payment)) AS G,
          Date_payment
 FROM     #data
 ),
 cte AS
 (
 SELECT 
          Contact_reference, 
          ROW_NUMBER() over (partition by Contact_reference 
                                 order by MIN(Date_payment)) RN,
          MIN(Date_payment) Range_start,
          MAX(Date_payment) Range_end, 
          COUNT(Payment_value) Payments, 
          SUM(Payment_value) Value
 FROM base
 GROUP BY Contact_reference, G
 )
 SELECT 
       c1.Contact_reference, 
       c1.Payments, 
       c1.Range_end, 
       c1.Range_start, 
       c1.Value, 
       DATEDIFF(month, c1.Range_end,c2.Range_start) months_until_next_payment
 FROM cte c1
 LEFT join cte c2 ON c1.Contact_reference=c2.Contact_reference and c2.RN = c1.RN+1


You can do it using cursor. Language like c#/java are better choice for this problem.

DECLARE @date DATETIME
DECLARE @nextDate DATETIME
DECLARE @rangeStart DATETIME
DECLARE @rangeEnd DATETIME

DECLARE @value decimal(18,2)
DECLARE @valueSum decimal(18,2)
DECLARE @count int

DECLARE @PaymentCursor CURSOR

SET @PaymentCursor = CURSOR FOR
    SELECT Date_payment, Payment_value FROM #data
    ORDER BY Date_payment

OPEN @PaymentCursor
    FETCH NEXT FROM @PaymentCursor INTO @nextDate, @value   
    SET @date = @nextDate
    SET @rangeStart = @nextDate
    SET @valueSum = 0
    SET @count = 0

    WHILE (@@FETCH_STATUS = 0)
    BEGIN       
        FETCH NEXT FROM @PaymentCursor INTO @nextDate, @value

        SET @count = @count + 1
        SET @valueSum = @valueSum + @value      

        IF (DATEDIFF(mm, @date, @nextDate) > 1)
        BEGIN
            SELECT @rangeStart AS RangeStart, @date AS RangeEnd, @count AS Coount, @valueSum AS VALUE, DATEDIFF(mm, @date, @nextDate) AS months_until_next_payment
            SET @valueSum = 0               
            SET @count = 0
            SET @rangeStart = @nextDate
        END

        SET @date = @nextDate           
    END

    SELECT @rangeStart AS RangeStart, @date AS RangeEnd, @count AS Coount, @valueSum AS VALUE, null AS months_until_next_payment

CLOSE @PaymentCursor
DEALLOCATE @PaymentCursor
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜