Counting groups of data in SQL
I have a table of about 1.6M rows where the records have a unique (identity) ID and then also have a 8-character "code" field. The code field it used to group data into sets...all rows with the same code are in the same set. Each set should consist of 12 records, but it seems our data load was off and some sets are incomplete.
I need to identify the incomplete sets. How would I write a query to select out only those records that are part of a set that do NOT have the proper nu开发者_如何学运维mber of records in their set (12)?
I am using MS SQL 2008.
Thanks.
Try this one...
SELECT
CodeField,
COUNT(CodeField)
FROM table
GROUP BY CodeField
HAVING COUNT(CodeField) < 12
Use this to get codes with less than 12 records in the table:
SELECT yt.code
FROM YOUR_TABLE yt
GROUP BY yt.code
HAVING COUNT(*) < 12
For this exercise, you have to use the HAVING
clause - the WHERE
clause doesn't allow you to use aggregate functions (MIN, MAX, COUNT...) without them being in a subquery, but the HAVING
clause doesn't have that restriction.
To get codes with both less or more than 12 records in the table, use:
SELECT yt.code
FROM YOUR_TABLE yt
GROUP BY yt.code
HAVING COUNT(*) != 12
Nitpick: !=
is ANSI-92, vs <>
This will give you the individual records that need to be updated:
SELECT *
FROM MyTable
WHERE CodeField IN
(
SELECT CodeField,
COUNT(CodeField) AS [Count]
FROM MyTable
GROUP BY CodeField
HAVING COUNT(CodeField) <> 12
)
This is a perfect case for using the having
clause:
select codefield, count(*) from YOURTABLE
group by codefield
having count(*) <> 12
The HAVING clause allows you to restrict the results of a query that includes a GROUP BY
to just those records that match a certain condition. You can think of it as a variation on a WHERE
clause that works with grouped result sets.
A query can include both a WHERE
and a HAVING
clause - however, realize that the predicate in the WHERE
clause is evaluated before results are grouped, while the predicate in the HAVING
clause is evaluated after.
Note that I have used <>
rather than <
. This will identify codes that may have too many items as well as codes that have too few.
select code, count(*) as count
from MyTable
group by code
having count(*) <> 12
This should work:
select CodeField, count(*)
from yourtable
group by CodeField
having count(*) <> 12
You could accomplish this with a group by. Something like the following:
select
groups.code,
groups.code_count
from
(select
code,
count(code) code_count
from
records_table
group by code) groups
where
groups.code_count != 12
Enjoy!
精彩评论