MySQL: count the number of rows that have the same data in a column
I am trying to select everything in a table, and also count the number of rows in a table that have the same data.
SELECT *, COUNT(thedate) daycount FROM `table` ORDER BY thedate DESC
My hope is to have one query that outputs the date and number of rows associated with that date, and the looped output will be something like this:
Jan 1, 2000 (2 rows)
col1, col2, col3, col4
col1, col2, col3, col4
Jan 1, 2000 (3 rows)
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
Jan 1, 2000 (6 rows)
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
etc...
Does this make sense?
If you have a table that looks like this:
CREATE TABLE yourtable
(
datefield DATETIME,
col1 VARCHAR(20),
col2 INT,
col3 TINYINT,
col4 CHAR(5)
);
and you want the count of rows with identical col1 through col4 values on each day, you can run this query:
SELECT
COUNT(datefield) datefield_count,
LEFT(all_fields,10) datefield,
SUBSTR(all_fields,11) all_other_fields
FROM
(
SELECT
DATE(datefield) datefield,
CONCAT(DATE(datefield),'|',
COALESCE(col1,'< NULL >'),'|',
COALESCE(col2,'< NULL >'),'|',
COALESCE(col3,'< NULL >'),'|',
COALESCE(col4,'< NULL >'),'|') all_fields
FROM
yourtable
) A
GROUP BY all_fields;
Here is some sample data and the result of the query:
mysql> DROP TABLE IF EXISTS yourtable;
Query OK, 0 rows affected (0.04 sec)
mysql> CREATE TABLE yourtable
-> (
-> datefield DATETIME,
-> col1 VARCHAR(20),
-> col2 INT,
-> col3 TINYINT,
-> col4 CHAR(5)
-> );
Query OK, 0 rows affected (0.11 sec)
mysql> INSERT INTO yourtable VALUES
-> (DATE(NOW() - INTERVAL 1 DAY),'rolando',4,3 ,'angel'),
-> (DATE(NOW() - INTERVAL 1 DAY),'rolando',4,3 ,'angel'),
-> (DATE(NOW() - INTERVAL 1 DAY),'rolando',4,3 ,'angel'),
-> (DATE(NOW() - INTERVAL 1 DAY),'rolando',4,NULL,'angel'),
-> (DATE(NOW() - INTERVAL 1 DAY),'rolando',4,NULL,'angel'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,NULL,'edwards'),
-> (DATE(NOW() - INTERVAL 2 DAY),'rolando',4,NULL,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',5,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',5,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'pamela' ,4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'pamela' ,4,NULL,'edwards'),
-> (DATE(NOW() - INTERVAL 3 DAY),'pamela' ,5,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'pamela' ,5,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',4,2 ,'angel'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',4,NULL,'edwards'),
-> (DATE(NOW() - INTERVAL 3 DAY),'rolando',4,NULL,'angel')
-> ;
Query OK, 22 rows affected, 3 warnings (0.03 sec)
Records: 22 Duplicates: 0 Warnings: 3
mysql> SELECT * FROM yourtable;
+---------------------+---------+------+------+-------+
| datefield | col1 | col2 | col3 | col4 |
+---------------------+---------+------+------+-------+
| 2011-06-30 00:00:00 | rolando | 4 | 3 | angel |
| 2011-06-30 00:00:00 | rolando | 4 | 3 | angel |
| 2011-06-30 00:00:00 | rolando | 4 | 3 | angel |
| 2011-06-30 00:00:00 | rolando | 4 | NULL | angel |
| 2011-06-30 00:00:00 | rolando | 4 | NULL | angel |
| 2011-06-29 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-29 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-29 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-29 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-29 00:00:00 | rolando | 4 | NULL | edwar |
| 2011-06-29 00:00:00 | rolando | 4 | NULL | angel |
| 2011-06-28 00:00:00 | rolando | 5 | 2 | angel |
| 2011-06-28 00:00:00 | rolando | 5 | 2 | angel |
| 2011-06-28 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-28 00:00:00 | pamela | 4 | 2 | angel |
| 2011-06-28 00:00:00 | pamela | 4 | NULL | edwar |
| 2011-06-28 00:00:00 | pamela | 5 | 2 | angel |
| 2011-06-28 00:00:00 | pamela | 5 | 2 | angel |
| 2011-06-28 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-28 00:00:00 | rolando | 4 | 2 | angel |
| 2011-06-28 00:00:00 | rolando | 4 | NULL | edwar |
| 2011-06-28 00:00:00 | rolando | 4 | NULL | angel |
+---------------------+---------+------+------+-------+
22 rows in set (0.00 sec)
mysql> SELECT
-> COUNT(datefield) datefield_count,
-> LEFT(all_fields,10) datefield,
-> SUBSTR(all_fields,11) all_other_fields
-> FROM
-> (
-> SELECT
-> DATE(datefield) datefield,
-> CONCAT(DATE(datefield),'|',
-> COALESCE(col1,'< NULL >'),'|',
-> COALESCE(col2,'< NULL >'),'|',
-> COALESCE(col3,'< NULL >'),'|',
-> COALESCE(col4,'< NULL >'),'|') all_fields
-> FROM
-> yourtable
-> ) A
-> GROUP BY all_fields;
+-----------------+------------+----------------------------+
| datefield_count | datefield | all_other_fields |
+-----------------+------------+----------------------------+
| 1 | 2011-06-28 | |pamela|4|2|angel| |
| 1 | 2011-06-28 | |pamela|4|< NULL >|edwar| |
| 2 | 2011-06-28 | |pamela|5|2|angel| |
| 3 | 2011-06-28 | |rolando|4|2|angel| |
| 1 | 2011-06-28 | |rolando|4|< NULL >|angel| |
| 1 | 2011-06-28 | |rolando|4|< NULL >|edwar| |
| 2 | 2011-06-28 | |rolando|5|2|angel| |
| 4 | 2011-06-29 | |rolando|4|2|angel| |
| 1 | 2011-06-29 | |rolando|4|< NULL >|angel| |
| 1 | 2011-06-29 | |rolando|4|< NULL >|edwar| |
| 3 | 2011-06-30 | |rolando|4|3|angel| |
| 2 | 2011-06-30 | |rolando|4|< NULL >|angel| |
+-----------------+------------+----------------------------+
12 rows in set (0.00 sec)
mysql>
I'll leave it to your imagination and creativity to loop through this and print:
- datefield
- datefield_count
- all_other_fields, repeated datefield_count times
Give it a try!
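The loop described above could look roughly like this in Python. This is only a sketch and not part of the original answer: it assumes the grouped result has already been fetched as (datefield_count, datefield, all_other_fields) tuples, and the function name format_grouped_rows is hypothetical.

```python
def format_grouped_rows(rows):
    """Build report lines from (datefield_count, datefield, all_other_fields) tuples."""
    lines = []
    for count, datefield, other_fields in rows:
        # Header line: the date and how many identical rows were collapsed.
        lines.append(f"{datefield} ({count} rows)")
        # Repeat the collapsed row once per duplicate.
        lines.extend([other_fields] * count)
    return lines


# Two rows copied from the result set shown above.
sample = [
    (3, "2011-06-30", "|rolando|4|3|angel|"),
    (2, "2011-06-30", "|rolando|4|< NULL >|angel|"),
]
print("\n".join(format_grouped_rows(sample)))
```

Each group prints its own header, so the same date appears once per distinct combination of col1 through col4.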
This is not what the OP asked, but rather what I was looking for when I came upon this question. Perhaps some will find it useful.
select *
from thetable
join (
select thedate, count( thedate ) as cnt
from thetable
group by thedate
) as counts
using( thedate )
order by thedate
The above query will select everything with an extra field cnt containing the number of records having the same date. Then it is trivial to print something like:
some date, 2 records for this date
col1, col2, col3, col4
col1, col2, col3, col4
some other date, 3 records for this date
col1, col2, col3, col4
col1, col2, col3, col4
col1, col2, col3, col4
yet some other date, 1 record for this date
col1, col2, col3, col4
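One possible way to produce that layout from the joined result, sketched in Python. The assumptions here are mine, not the answer's: each row arrives as a dict keyed by column name, the data columns are literally named col1 through col4, and format_with_counts is a hypothetical helper name.

```python
def format_with_counts(rows):
    """Format rows (dicts with 'thedate', 'cnt', 'col1'..'col4') ordered by thedate."""
    lines = []
    previous_date = object()  # sentinel that never equals a real date
    for row in rows:
        if row["thedate"] != previous_date:
            # First row of a new date: emit the header using the precomputed cnt.
            noun = "record" if row["cnt"] == 1 else "records"
            lines.append(f"{row['thedate']}, {row['cnt']} {noun} for this date")
            previous_date = row["thedate"]
        lines.append(", ".join(str(row[c]) for c in ("col1", "col2", "col3", "col4")))
    return lines
```

Because cnt is already on every row, the header can be printed before the rows of that date without buffering anything.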
SELECT ...
FROM yourtable
GROUP BY DATE(datefield)
ORDER BY COUNT(DATE(datefield)) DESC
Note that I'm using the DATE() function in case your date fields are actually datetime values. If you group on a datetime, it groups by the full yyyy-mm-dd hh:mm:ss value, not just the yyyy-mm-dd, and you'd get totally different results.
That'll get you the core results. Doing the output as you want it will need some post-processing in your script, but isn't too hard. Just buffer the found rows until the date changes, then output the buffer with a row count.
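A rough Python sketch of that buffering step, under my own assumptions: the ungrouped rows have been fetched ordered by date as (date, row) pairs, and group_by_date is a hypothetical name.

```python
def group_by_date(rows):
    """Yield (date, buffered_rows) from (date, row) pairs that are ordered by date."""
    current_date = None
    buffer = []
    for date, row in rows:
        if buffer and date != current_date:
            # The date changed: hand over the finished group and start a new one.
            yield current_date, buffer
            buffer = []
        current_date = date
        buffer.append(row)
    if buffer:
        # Flush the last group, which no later date change would trigger.
        yield current_date, buffer


# Hypothetical rows for illustration only.
fetched = [
    ("2011-06-30", "rolando, 4, 3, angel"),
    ("2011-06-30", "rolando, 4, 3, angel"),
    ("2011-06-29", "rolando, 4, 2, angel"),
]
for date, group in group_by_date(fetched):
    print(f"{date} ({len(group)} rows)")
    for row in group:
        print(row)
```

The row count comes from the length of the buffer, so no COUNT() is needed in the query for this variant.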
SELECT *, COUNT(thedate) daycount
FROM `table`
GROUP BY thedate
ORDER BY thedate DESC
SELECT thedate, COUNT(id)
FROM `table`
WHERE 1
GROUP BY thedate
ORDER BY thedate