group by mysql option
I am writing a converter to transfer data from old systems to new systems. I am using php+mysql.
I have one table that contains millions records with duplicate entries. I want to transfer that data in a new table and remove all entries. I am using following queries and pseudo code to perform this task
select *
from table1
insert into table2
ON DUPLICATE KEY UPDATE customer_information = concat('$firstName',',','$lastName')
It takes ages to process one table :(
I am pondering that is it possible to use group by and get all grouped record automatically?
Other than going through each record and checking duplicate etc.?
For example
select *
from table1
group by firstName, lastName
insert into table 2 only one record and add all users'
first last name into column ALL_NAMES with comma
EDIT
There are different records for each customers with different information. Each row is called duplicated if first and last name of user is same. In new table, we will just add one customer and their bought product in different columns (we have only 4 pr开发者_JS百科oducts).
I don't know what you are trying to do with customer_information, but if you just want to transfer the non-duplicated set of data from one table to another, this will work:
INSERT IGNORE INTO table2(field1, field2, ... fieldx)
SELECT DISTINCT field1, field2, ... fieldx
FROM table1;
DISTINCT
will take care of rows that are exact duplicates. But if you have rows that are only partial duplicates (like the same last and first names but a different email) then IGNORE
can help. If you put a unique index on table2(lastname,firstname) then IGNORE will make sure that only the first record with lastnameX, firstnameY from table1 is inserted. Of course, you might not like which record of a pair of partial duplicates is chosen.
ETA
Now that you've updated your question, it appears that you want to put the values of multiple rows into one field. This is, generally speaking, a bad idea because when you denormalize your data this way you make it much less accessible. Also, if you are grouping by (lastname, firstname), there will not be names in allnames. Because of this, my example uses allemails instead. In any event, if you really need to do this, here's how:
INSERT INTO table2(lastname, firstname, allemails)
SELECT lastname, firstname, GROUP_CONCAT(email) as allemails
FROM table1
GROUP BY lastname, firstname;
If they are really duplicate rows (every field is the the same) then you can use:
select DISTINCT * from table1
instead of :
select * from table1
精彩评论