mysql duplicate data deletion
This shows me all the first names and last names that have exactly two entries that are identical
SELECT `firstname`,`lastname`,COUNT(*) AS Count
FROM `people`
GROUP BY `firstname`,`lastname`
HAVING Count = 2
How do I turn this into a DELETE FROM WHERE statement with a LIMIT to only remove one of each of the entries and leave the other one.
okay this appears to be way to technic开发者_如何学Goal i'm just going to do it in a php while loop
You can create a table with 1 record of each of the duplicates: Then delete all the dup records from the people table and then re-insert the dup records.
-- Setup for example
create table people (fname varchar(10), lname varchar(10));
insert into people values ('Bob', 'Newhart');
insert into people values ('Bob', 'Newhart');
insert into people values ('Bill', 'Cosby');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Jim', 'Gaffigan');
insert into people values ('Adam', 'Sandler');
-- Show table with duplicates
select * from people;
-- Create table with one version of each duplicate record
create table dups as
select distinct fname, lname, count(*)
from people group by fname, lname
having count(*) > 1;
-- Delete all matching duplicate records
delete people from people inner join dups
on people.fname = dups.fname AND
people.lname = dups.lname;
-- Insert single record of each dup back into table
insert into people select fname, lname from dups;
-- Show Fixed table
select * from people;
if you have a primary key, such as id, you can do:
delete from people
where id not in
(
select minid from
(select min(id) as minid from people
group by firstname, lastname) as newtable
)
The subquery select min(id)...
bit is getting you the unique (based on id) rows for a given firstname, lastname combination; and then you're deleting all other rows, i.e. your duplicates. You need to wrap your subquery due to a bug in mysql, otherwise we could do:
delete from people
where id not in
(
select min(id) as minid from people
group by firstname, lastname
)
better would be:
delete people from
people left outer join
(
select min(id) as minid from people
group by firstname, lastname
) people_grouped
on people.first_name = people_grouped.first_name
and people.last_name = people_grouped.last_name
and people_grouped.id is null
to avoid the subquery.
Create a new table and add a unique key on (firstname,lastname). Then insert the rows in old table into the new table. Then rename the tables.
mysql> select * from t;
+-----------+----------+
| firstname | lastname |
+-----------+----------+
| A | B |
| A | B |
| X | Y |
+-----------+----------+
3 rows in set (0.00 sec)
mysql> create table t2 like t;
Query OK, 0 rows affected (0.00 sec)
mysql> alter table t2 add unique key name(firstname,lastname);
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> insert ignore into t2 select * from t;
Query OK, 2 rows affected (0.00 sec)
Records: 3 Duplicates: 1 Warnings: 0
mysql> select * from t2;
+-----------+----------+
| firstname | lastname |
+-----------+----------+
| A | B |
| X | Y |
+-----------+----------+
2 rows in set (0.01 sec)
精彩评论