How do I calculate a running SUM?

2023-02-23 00:19 问答作者：

How do I get a column that is the sum of the preceding values of another开发者_如何学Go column?

As of SQLite 3.25.0, since 2018-09-15, window functions and their keyword OVER are supported. The answer to your question is now easy:

SELECT Country, Gdp, SUM(Gdp) OVER (ROWS UNBOUNDED PRECEDING)
FROM CountryGdp;

This is the minimal query that does what you request, but it doesn't define any ordering so here is a more proper way of doing it.

SELECT
    Country,
    Gdp,
    SUM(Gdp) OVER (
        ORDER BY Country -- Window ordering (not necessarily the same as result ordering!)
        ROWS BETWEEN -- Window for the SUM includes these rows:
            UNBOUNDED PRECEDING -- all rows before current one in window ordering
            AND CURRENT ROW -- up to and including current row.
        ) AS RunningTotal
FROM CountryGdp
ORDER BY Country;

In any way, the query should run in O(N) time.

You can do it by joining the table with itself (performing a so-called Cartesian or cross join). See the following example.

SELECT a.name, a.gdppc, SUM(b.gdppc)
FROM gdppc AS a, gdppc AS b WHERE b.gdppc <= a.gdppc 
GROUP BY b.id ORDER BY a.gdppc;

Given a table containing countries and their per capita GDP it will give you a running total of the GDP figure.

Democratic Republic of Congo|329.645|329.645
Zimbabwe|370.465|700.11
Liberia|385.417|1085.527
Burundi|399.657|1485.184
Eritrea|678.954|2164.138
Niger|711.877|2876.015
Central African Republic|743.945|3619.96
Sierra Leone|781.594|4401.554
Togo|833.803|5235.357
Malawi|867.063|6102.42
Mozambique|932.511|7034.931
...

Note that this can be a very resource-intensive operation, because if a table has N elements it will create a temporary table with N*N elements. I would not perform it on a large table.

Cross join solutions like Diomidis Spinellis suggested take O(N^2) time. A recursive CTE can work faster, if you can stomach the convoluted code.

This produces the same output as his.

WITH RECURSIVE running(id, name, gdppc, rt) AS (
    SELECT row1._rowid_, row1.name, row1.gdppc, COALESCE(row1.gdppc,0)
    FROM gdppc AS row1
    WHERE row1._rowid_ = (
        SELECT a._rowid_
        FROM gdppc AS a
        ORDER BY a.gdppc, a.name, a._rowid_
        LIMIT 1)
    UNION ALL
    SELECT row_n._rowid_, row_n.name, row_n.gdppc, COALESCE(row_n.gdppc,0)+running.rt
    FROM gdppc AS row_n INNER JOIN running
    ON row_n._rowid_ = (
        SELECT a._rowid_
        FROM gdppc AS a
        WHERE (a.gdppc, a.name, a._rowid_) > (running.gdppc, running.name, running.id)
        ORDER BY a.gdppc, a.name, a._rowid_
        LIMIT 1))
SELECT running.name, running.gdppc, running.rt
FROM running;

Ordering and comparisons take care of duplicates, COALESCE is there to ignore NULLs.

If you have a good index, this should be O(N log N). Since SQLite doesn't support cursors, an O(N) solution probably doesn't exist without relying on an external application.

If you have a version of SQLite that doesn't support OVER here is another approach using recursion against a group_concat string of rows.
On SQLite version 3.22.0 2018-01-22 18:45:57 group_concat returns rows in database order. Create a common table expression and order it for different orders as in table work1 in the example.

/* cumulative running total using group_concat and recursion
   adapted from https://blog.expensify.com/2015/09/25/the-simplest-sqlite-common-table-expression-tutorial/
*/

WITH RECURSIVE work2 AS (
  SELECT NULL AS name, NULL AS gdppc, 0 AS cum, (select group_concat(name) from work1) AS gcname, (select group_concat(gdppc) from work1) AS gcgdppc 
  UNION
        SELECT
            CASE
                WHEN INSTR(gcname, ',' )>0 THEN 
                    SUBSTR(gcname, 0, INSTR(gcname,','))
                ELSE
                    gcname
            END,
            CASE
                WHEN INSTR(gcgdppc, ',' )>0 THEN 
                    SUBSTR(gcgdppc, 0, INSTR(gcgdppc,','))
                ELSE
                    gcgdppc
            END,
            CASE
                WHEN INSTR(gcgdppc, ',' )>0 THEN 
                    cum + SUBSTR(gcgdppc, 0, INSTR(gcgdppc,','))
                ELSE
                    cum + gcgdppc
            END,
            CASE
                WHEN INSTR( gcname, ',' )>0 THEN 
                    SUBSTR( gcname, INSTR( gcname, ',' )+1 )
                ELSE
                    NULL
            END,
            CASE
                WHEN INSTR(gcgdppc, ',' )>0 THEN 
                    SUBSTR( gcgdppc, INSTR( gcgdppc, ',' )+1 )
                ELSE
                    NULL
            END
        FROM work2
        WHERE gcgdppc IS NOT NULL

  ), 
/* SQLite version 3.22.0 2018-01-22 18:45:57
   group_concat ignores ORDER BY when specified against the base table
   but does appear to follow the order of a common table expression 
*/
  work1 AS (select * from gdppc order by gdppc),

  gdppc AS (SELECT 'Burundi' AS name,399.657 AS gdppc
            UNION
            SELECT 'Democratic Republic of Congo', 329.645
            UNION
            SELECT 'Liberia',385.417
            UNION
            SELECT 'Zimbabwe',370.465)

select name,gdppc,cum from work2 where name IS NOT NULL;
/* result
Democratic Republic of Congo|329.645|329.645
Zimbabwe|370.465|700.11
Liberia|385.417|1085.527
Burundi|399.657|1485.184
*/

Since this post is the top Google result for "sqlite rolling sum", I'll add the below answer.

The top answer by relatively_random works great for cumulative sum. However, if you want to adjust it to a rolling sum (perhaps a 7-day rolling sum), you can do so by adding a CASE expression to check if the window has the required size (e.g. 7).¹

This is necessary(?) because normally, rolling computation returns NULL if the rolling window size is smaller than it's supposed to be. For example, if we're computing the 7-day rolling sums, the first 6 rows should have NULL total because it's not been 7 days yet.

SELECT
    date,
    price,
    CASE
        WHEN COUNT(*) OVER win >= 7 THEN SUM(price) OVER win
    END AS seven_day_rolling_total
FROM price_data
WINDOW win AS (
    ORDER BY date
    ROWS 6 PRECEDING
)
--LIMIT 10;

^{1: To make the query a little readable, I defined the window after the SELECT statement because it is used twice.}

You have to do a sum in the field you want.... The query depends on the database you're using, Oracle allows you to do this:

select id, value, sum(value) as partial_sum over (order by id) from table

继续阅读：aggregate-functions rolling-computation sql

How do I calculate a running SUM?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？