开发者

T-sql problem with running sum

I am trying to write T-sql script which will find "open" records for one table

Structure of data is following

Id (int PK)      Ts (datetime)       Art_id (int)     Amount (float)
1                '2009-01-01'        1                1
2                '2009-01-05'        1                -1
3                '2009-01-10'        1                1
4                '2009-01-11'        1                -1
5                '2009-01-13'        1                1
6                '2009-01-14'        1                1
7                '2009-01-15'        2                1
8                '2009-01开发者_C百科-17'        2                -1
9                '2009-01-18'        2                1

According to my needs I am trying to show only records after last sum for every one articles where 0 sorting by date of last running sum of zero value. So I am trying to abstract (show) records 5 and 6 for Art_id=1 and record 9 for art_id=2. I am using MSSQL2005 and my table has around 30K records with 6000 distinct values of ART_ID.


In this solution I simply want to find all the rows where there isn't a subsequent row for that Art_id where the running sum was 0. I am assuming we can use the ID as a better tiebreaker than TS, since two rows can come in with the same timestamp but they will get sequential identity values.

;WITH base AS
(
    SELECT
        ID, Art_id, TS, Amount,
        RunningSum = Amount + COALESCE
        (
            (
               SELECT SUM(Amount)
                FROM dbo.foo
                WHERE Art_id = f.Art_id 
                AND ID < f.ID
            )
            , 0
        )
    FROM dbo.[table name] AS f
)
SELECT ID, Art_id, TS, Amount
FROM base AS b1
WHERE NOT EXISTS
(
    SELECT 1 
    FROM base AS b2 
    WHERE Art_id = b1.Art_id
    AND ID >= b1.ID
    AND RunningSum = 0
)
ORDER BY ID;


Complete working query:

SELECT 
  * 
FROM TABLE_NAME E
JOIN
  (SELECT
    C.ART_ID,
    MAX(TS) MAX_TS
  FROM
    (SELECT 
      ART_ID,
      TS,
      COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
    FROM TABLE_NAME A) C
  WHERE C.ROW_SUM = 0
  GROUP BY C.ART_ID) D
ON 
  (D.ART_ID = E.ART_ID) AND
  (E.TS >= D.MAX_TS)

First we calculate running sums for every row:

SELECT 
  ART_ID,
  TS,
  COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A

Then we look for last article with 0:

SELECT
  C.ART_ID,
  MAX(TS) MAX_TS
FROM
  (SELECT 
    ART_ID,
    TS,
    COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
  FROM TABLE_NAME A) C
WHERE C.ROW_SUM = 0
GROUP BY C.ART_ID


You can find all rows where the running sum is zero with:

select cur.id, cur.art_id
from @articles cur
left join @articles prev
    on prev.art_id = cur.art_id
    and prev.id <= cur.id
group by cur.id, cur.art_id
having sum(prev.amount) = 0

Then you can query all rows that come after the rows with a zero running sum:

select a.*
from @articles a
left join (
    select cur.id, cur.art_id, running = sum(prev.amount)
    from @articles cur
    left join @articles prev
        on prev.art_id = cur.art_id
        and prev.ts <= cur.ts
    group by cur.id, cur.art_id
    having sum(prev.amount) = 0
) later_zero_running on
    a.art_id = later_zero_running.art_id
    and a.id <= later_zero_running.id
where later_zero_running.id is null

The LEFT JOIN in combination with the WHERE says: there can not be a row after this row, where the running sum is zero.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜