Combined aggregates into a single SQL table
We have a table which is a bit like this:
Year Episode Code
2000 1 A001
2000 1 A001
2000 1 C007
2000 2 A001
2000 2 B001
It's referencing another table, in which the combination of Year
and Episode
is unique, but this table I'm working 开发者_如何学编程with just lists the selection of codes applied to each episode.
What I'm trying to do is create a table that returns a per-year count of total episodes and episodes that have a particular code. I can't just do a simple "COUNT(*)" of the code, because one episode may have the same code multiple times.
Can anyone see a viable way to do what I am attempting?
This might be what you are after. You need at least SQL Server 2005 for the pivot function.
create table MyTable (
[Year] datetime,
Episode int,
Code nvarchar(20)
)
insert into MyTable values ('01-01-2000', 1, 'A001')
insert into MyTable values ('01-01-2000', 1, 'A001')
insert into MyTable values ('01-01-2000', 1, 'C007')
insert into MyTable values ('01-01-2000', 2, 'A001')
insert into MyTable values ('01-01-2000', 2, 'B001')
insert into MyTable values ('01-01-2000', 2, 'B001')
insert into MyTable values ('01-01-2001', 1, 'A001')
insert into MyTable values ('01-01-2002', 1, 'A001')
insert into MyTable values ('01-01-2003', 1, 'C007')
select [Code], [2000], [2001], [2002]
from (
select Code,
DATEPART(year, [Year]) as date,
count(Episode) as instances
from MyTable
group by DATEPART(year, [year]), code) as o
pivot
(
sum(instances) for date in ([2000], [2001], [2002])
) as p
By “create a table” I'm going to assume you know how to create a table from a result set. So this reduces to: What query will return a result set with a per-year count of episodes.
You cite duplicate tuples as a problem. You're right, they are a problem; why are they there? What is the semantic difference between one tuple of (2000, 1, 'A001')
versus three identical ones?
The DISTINCT
clause is designed to strip those; I recommend using it any time you don't know an explicit good reason to keep duplicates in the result set.
So if duplicates aren't meaningful, and a result set is sufficient for you to progress, then:
SELECT DISTINCT
year,
COUNT(episode) AS episode_count
FROM (
SELECT DISTINCT
year,
episode
FROM episode_code
WHERE
code = 'A001'
) AS episode_for_code
GROUP BY year
will give a count of episodes by year, for episode_code
tuples with a specified code.
Once you have that, I'd investigate why you have duplicates in episode_code
at all, and remove them unless you get a good answer.
精彩评论