开发者

Merging related data when a column should become unique

Given a mysql-database with tables as follows:

author:
+----+----------+
| id | name     |
+----+----------+
| 1  | John     |
| 2  | Peter    |
| 3  | Peter    |
+----+----------+

article:
+----+-----------+------+
| id | author_id | text |
+----+-----------+------+
| 1  | 2         | ...  |
| 2  | 3         | ...  |
| 3  | 3      开发者_开发技巧   | ...  |
+----+-----------+------+

The author-table's name-column wasn't set to unique by accident. Now I have to "merge" related articles into one of the related authors, i.e. set author_id of articles 2 and 3 to 2. I want to make the name-column unique afterwards.

I cannot reassign the articles manually, because there are too many affected records. But I thought there may be a ready solution / snippet for this problem.


To update your article table, this will do the trick:

update article art
   set art.author_id = (select min(aut.id)
                          from author aut
                         where aut.name = (select a.name
                                             from author a
                                            where a.id = art.author_id));

select * from article;    
+ ------- + -------------- + --------- +
| id      | author_id      | text      |
+ ------- + -------------- + --------- +
| 1       | 2              |           |
| 2       | 2              |           |
| 3       | 2              |           |
+ ------- + -------------- + --------- +
3 rows

if you prefer a more compact update (and more optimized), then you can use this one, that works the same way:

update article art
   set art.author_id = (select min(aut.id)
                          from author aut
                         inner join author a on a.name = aut.name
                         where a.id = art.author_id);

Finally, to delete the extra authors, you need

delete a
  from author a
 inner join (
    select name, min(id) as min -- this subquery returns all repeated names and their smallest id
      from author
     group by name
    having count(*) > 1) repeated on repeated.name = a.name
 where a.id > repeated.min;     -- delete all repeateds except the first one

select * from author;    
+ ------- + --------- +
| id      | name      |
+ ------- + --------- +
| 1       | John      |
| 2       | Peter     |
+ ------- + --------- +
2 rows

This works for any number of repeated sets of authors.

Hope this helps


You can do an update article first to use the lowest author id having the same name

UPDATE art SET art.author_id =
    (SELECT MIN(a1.id) FROM author a1 WHERE a1.Name = a2.name
        FROM article art INNER JOIN author a2 ON art.author_id = a2.id)

Then delete the higher author having the same name

PS. I have not tested the SQL but should work.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜