Finding "duplicate" rows that differ in one column
I have a table like the following in MySQL 5.1:
+--------------+----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------开发者_运维问答--+----------------+------+-----+---------+----------------+
| log_id | int(11) | NO | PRI | NULL | auto_increment |
| date | datetime | NO | MUL | NULL | |
| date_millis | int(3) | NO | | NULL | |
| eib_address | varchar(20) | NO | | NULL | |
| ip_address | varchar(15) | NO | | NULL | |
| value | decimal(20,10) | NO | MUL | NULL | |
| application | tinyint(4) | NO | | NULL | |
| phys_address | varchar(20) | NO | | NULL | |
| orig_log_id | bigint(20) | NO | | NULL | |
+--------------+----------------+------+-----+---------+----------------+
In this table, log_id
and orig_log_id
are always unique. It is possible that two rows may have duplicate values for any of the other fields, though. Ignoring the *log_id
fields, our problem is that two rows may be identical in all other columns, but have differing values for value
. I am trying to figure out the correct SQL query to identify when two (or more) rows have identical values for date
, date_millis
and eib_address
, but different values for value
, log_id
and orig_log_id
. So far, I've been able to come up with a query that accomplishes the first clause in my previous sentence:
SELECT main.*
FROM sensors_log main
INNER JOIN
(SELECT date, date_millis, eib_address
FROM sensors_log
GROUP BY date, date_millis, eib_address
HAVING count(eib_address) > 1) dupes
ON main.date = dupes.date
AND main.date_millis = dupes.date_millis
AND main.eib_address = dupes.eib_address;
However, I can't seem to figure out when value
differs. I at least know that just throwing AND main.value != dupes.value
into the ON
clause doesn't do it!
I think it's a bit simpler than you're trying to make it. Try this:
SELECT *
FROM SENSORS_LOG s1
INNER JOIN SENSORS_LOG s2
ON (s2.DATE = s1.DATE AND
s2.DATE_MILLIS = s1.DATE_MILLIS AND
s2.EIB_ADDRESS = s1.EIB_ADDRESS)
WHERE s1.VALUE <> s2.VALUE OR
s1.LOG_ID <> s2.LOG_ID OR
s1.ORIG_LOG_ID <> s2.ORIG_LOG_ID;
Share and enjoy.
Maybe I mistook the problem, but can't you just perform a COUNT
like this?
SELECT date, date_millis, eib_address, count(*) as nr_dupes
FROM sensors_log
GROUP BY date, date_millis, eib_address
HAVING count(*) > 1
or
SELECT date, date_millis, eib_address,
group_concat(value), group_concat(log_id), group_concat(orig_log_id)
FROM sensors_log
GROUP BY date, date_millis, eib_address
HAVING count(*) > 1
精彩评论