开发者

group by mysql option

I am writing a converter to transfer data from old systems to new systems. I am using php+mysql.

I have one table that contains millions records with duplicate entries. I want to transfer that data in a new table and remove all entries. I am using following queries and pseudo code to perform this task

select * 
from table1
insert into table2 
ON DUPLICATE KEY UPDATE customer_information = concat('$firstName',',','$lastName')

It takes ages to process one table :(

I am pondering that is it possible to use group by and get all grouped record automatically?

Other than going through each record and checking duplicate etc.?

For example

select * 
from table1 
group by firstName, lastName
insert into table 2 only one record and add all users' 
first last name into column ALL_NAMES with comma

EDIT

There are different records for each customers with different information. Each row is called duplicated if first and last name of user is same. In new table, we will just add one customer and their bought product in different columns (we have only 4 pr开发者_JS百科oducts).


I don't know what you are trying to do with customer_information, but if you just want to transfer the non-duplicated set of data from one table to another, this will work:

INSERT IGNORE INTO table2(field1, field2, ... fieldx)
  SELECT DISTINCT field1, field2, ... fieldx
    FROM table1;

DISTINCT will take care of rows that are exact duplicates. But if you have rows that are only partial duplicates (like the same last and first names but a different email) then IGNORE can help. If you put a unique index on table2(lastname,firstname) then IGNORE will make sure that only the first record with lastnameX, firstnameY from table1 is inserted. Of course, you might not like which record of a pair of partial duplicates is chosen.

ETA

Now that you've updated your question, it appears that you want to put the values of multiple rows into one field. This is, generally speaking, a bad idea because when you denormalize your data this way you make it much less accessible. Also, if you are grouping by (lastname, firstname), there will not be names in allnames. Because of this, my example uses allemails instead. In any event, if you really need to do this, here's how:

INSERT INTO table2(lastname, firstname, allemails)
  SELECT lastname, firstname, GROUP_CONCAT(email) as allemails
    FROM table1
    GROUP BY lastname, firstname;


If they are really duplicate rows (every field is the the same) then you can use:

select DISTINCT * from table1

instead of :

select * from table1
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜