Query for missing elements
I have a table with the following structure:
timestamp | name | value
0 | john | 5
1 | NULL | 3
8 | NULL | 12
12 | john | 3
33 | NULL | 4
54 | pete | 1
180 | NULL | 4
400 | john | 3
401 | NULL | 4
592 | anna | 2
Now what I am looking for is a query that gives me the sum of the values for each name, treating each NULL (with rows ordered by timestamp) as the first non-null name further down the list, as if the table were as follows:
timestamp | name | value
0 | john | 5
1 | john | 3
8 | john | 12
12 | john | 3
33 | pete | 4
54 | pete | 1
180 | john | 4
400 | john | 3
401 | anna | 4
592 | anna | 2
and I would query SUM(value), name from this table group by name. I have thought about it and tried, but I can't come up with a proper solution. I have looked at recursive common table expressions and think the answer may lie there, but I haven't been able to properly understand them.
These tables are just examples, and I don't know the timestamp values in advance.
Could someone give me a hand? Help would be very much appreciated.
With Inputs As
(
    Select 0 As [timestamp], 'john' As Name, 5 As value
    Union All Select 1, NULL, 3
    Union All Select 8, NULL, 12
    Union All Select 12, 'john', 3
    Union All Select 33, NULL, 4
    Union All Select 54, 'pete', 1
    Union All Select 180, NULL, 4
    Union All Select 400, 'john', 3
    Union All Select 401, NULL, 4
    Union All Select 592, 'anna', 2
)
, NamedInputs As
(
    Select I.timestamp
        -- a NULL name takes the next non-null name in timestamp order
        , Coalesce (I.Name
            , (
                Select I3.Name
                From Inputs As I3
                Where I3.timestamp = (
                    Select Min(I2.timestamp)
                    From Inputs As I2
                    Where I2.timestamp > I.timestamp
                        And I2.Name Is Not Null
                    )
            )) As name
        , I.value
    From Inputs As I
)
Select NI.name, Sum(NI.Value) As Total
From NamedInputs As NI
Group By NI.name
Btw, it would be orders of magnitude faster than any of these queries to correct the data first: update the name column to hold the proper value, make it non-nullable, and then run a simple Group By to get your totals.
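A minimal sketch of that clean-up, assuming the rows live in a permanent table called dbo.Inputs (a hypothetical name; the CTE above is only sample data), that timestamp is unique, and that name is a varchar(50) (also an assumption). It follows the question's rule of giving each NULL the next non-null name in timestamp order:

```sql
-- Assumption: dbo.Inputs([timestamp], name, value) is a hypothetical table
-- holding the real data, with unique [timestamp] values.

-- 1. Fill each NULL name with the next non-null name in timestamp order.
UPDATE I
SET    name = (
           SELECT TOP (1) I2.name
           FROM   dbo.Inputs AS I2
           WHERE  I2.[timestamp] > I.[timestamp]
             AND  I2.name IS NOT NULL
           ORDER BY I2.[timestamp] ASC
       )
FROM   dbo.Inputs AS I
WHERE  I.name IS NULL;

-- 2. Make the column non-nullable. This fails if trailing rows with no later
--    name are still NULL, so handle those first. varchar(50) is an assumed type.
ALTER TABLE dbo.Inputs ALTER COLUMN name varchar(50) NOT NULL;

-- 3. The totals are now a plain aggregate.
SELECT name, SUM(value) AS Total
FROM   dbo.Inputs
GROUP BY name;
```

After step 1 the correlated subquery is paid once rather than on every report, which is where the speed-up would come from.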
Additional Solution
Select Coalesce(I.Name, I2.Name) As name, Sum(I.value) As Total
From Inputs As I
    Left Join (
        -- for each row, the timestamp of the next row that has a name
        Select I1.timestamp, Min(I2.timestamp) As NextNameTimestamp
        From Inputs As I1
            Left Join Inputs As I2
                On I2.timestamp > I1.timestamp
                    And I2.Name Is Not Null
        Group By I1.timestamp
        ) As Z
        On Z.timestamp = I.timestamp
    Left Join Inputs As I2
        On I2.timestamp = Z.NextNameTimestamp
Group By Coalesce(I.Name, I2.Name)
You don't need a CTE; a simple correlated subquery will do.
select t.timestamp, ISNULL(t.name, (
        -- next non-null name in timestamp order
        select top(1) i.name
        from inputs i
        where i.timestamp > t.timestamp
            and i.name is not null
        order by i.timestamp asc
    )) as name, t.value
from inputs t
And summing from here
select name, SUM(value) as totalValue
from
(
    select t.timestamp, ISNULL(t.name, (
            -- next non-null name in timestamp order
            select top(1) i.name
            from inputs i
            where i.timestamp > t.timestamp
                and i.name is not null
            order by i.timestamp asc
        )) as name, t.value
    from inputs t
) N
group by name
I hope I'm not going to be embarrassed by offering you this little recursive CTE query of mine as a solution to your problem.
;WITH
numbered_table AS (
  SELECT
    timestamp, name, value,
    /* number from the latest row so each NULL takes the next non-null name */
    rownum = ROW_NUMBER() OVER (ORDER BY timestamp DESC)
  FROM your_table
),
filled_table AS (
  SELECT
    timestamp,
    name,
    value,
    rownum
  FROM numbered_table
  WHERE rownum = 1
  UNION ALL
  SELECT
    nt.timestamp,
    name = ISNULL(nt.name, ft.name),
    nt.value,
    nt.rownum
  FROM numbered_table nt
    INNER JOIN filled_table ft ON nt.rownum = ft.rownum + 1
)
SELECT timestamp, name, value
FROM filled_table
/* or go ahead aggregating instead */
OPTION (MAXRECURSION 0)  /* lift the default limit of 100 recursion levels */