
MySQL: Avoiding the Cartesian product of repeating records when self-joining

There are two tables, A and B. They have the same columns and practically identical data. Both have auto-incremented IDs; the only difference between the two is that the same record has a different ID in each table.

Among the columns there is an IDENTIFIER column which is not unique, i.e. within each table a (very small) number of records share the same IDENTIFIER value.

Now, in order to find a correspondence between the IDs of table A and the IDs of table B, I have to join these two tables (for all purposes it's a self-join) on the IDENTIFIER column, something like:

SELECT A.ID, B.ID
FROM A INNER JOIN B ON A.IDENTIFIER = B.IDENTIFIER

But since IDENTIFIER is not unique, this generates every possible combination of the rows that share an IDENTIFIER value, which I don't want.

Ideally, I would like to generate a one-to-one association between the IDs that have repeating IDENTIFIER values, based on their order. For example, suppose there are six records with different IDs and the same IDENTIFIER value in table A (and thus in table B):

A                                 B
IDENTIFIER:'ident105', ID:10  ->  IDENTIFIER:'ident105', ID:3
IDENTIFIER:'ident105', ID:20  ->  IDENTIFIER:'ident105', ID:400
IDENTIFIER:'ident105', ID:23  ->  IDENTIFIER:'ident105', ID:420
IDENTIFIER:'ident105', ID:100 ->  IDENTIFIER:'ident105', ID:512
IDENTIFIER:'ident105', ID:120 ->  IDENTIFIER:'ident105', ID:513
IDENTIFIER:'ident105', ID:300 ->  IDENTIFIER:'ident105', ID:798

That would be ideal. Anyway, a way to generate a one to one association regardless of the order of the IDs would still be ok (but not preferred).

Thanks for your time,

Silvio


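-- Number the rows inside each IDENTIFIER group in both tables, then join on
-- (identifier, rn) so the n-th row of a group in A meets the n-th row in B.
-- Each derived table uses its own pair of user variables so the two numbering
-- passes cannot interfere, and rows are ordered by id within each identifier.
-- Note: this relies on MySQL evaluating the select list left to right in
-- ORDER BY order, which the MySQL manual does not guarantee.
select a_numbered.id as a_id, a_numbered.identifier, b_numbered.id as b_id
from (
    select a.*,
           case
              when @ident_a = a.identifier then @rn_a := @rn_a + 1
              else @rn_a := 1
           end as rn,
           @ident_a := a.identifier as prev_identifier
      from a
      join (select @rn_a := 0, @ident_a := null) r
     order by a.identifier, a.id
) a_numbered
join (
    select b.*,
           case
              when @ident_b = b.identifier then @rn_b := @rn_b + 1
              else @rn_b := 1
           end as rn,
           @ident_b := b.identifier as prev_identifier
      from b
      join (select @rn_b := 0, @ident_b := null) r
     order by b.identifier, b.id
) b_numbered
  on a_numbered.rn = b_numbered.rn
 and a_numbered.identifier = b_numbered.identifier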
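
If the server happens to be MySQL 8.0 or later (an assumption; the question does not state a version), the same per-group numbering can be written with ROW_NUMBER() instead of user variables. This is only a sketch of that alternative, using the table and column names from the question; the aliases a_id, b_id and rn are made up for illustration.

```sql
-- Assumes MySQL 8.0+ (CTEs and window functions) and tables a, b
-- with columns id and identifier, as described in the question.
WITH a_numbered AS (
    SELECT id, identifier,
           -- position of the row within its identifier group, by ascending id
           ROW_NUMBER() OVER (PARTITION BY identifier ORDER BY id) AS rn
    FROM a
),
b_numbered AS (
    SELECT id, identifier,
           ROW_NUMBER() OVER (PARTITION BY identifier ORDER BY id) AS rn
    FROM b
)
SELECT a_numbered.id AS a_id,
       a_numbered.identifier,
       b_numbered.id AS b_id
FROM a_numbered
JOIN b_numbered
  ON a_numbered.identifier = b_numbered.identifier
 AND a_numbered.rn = b_numbered.rn;
```

Because the numbering is ordered by id within each identifier, the lowest ID in A is paired with the lowest ID in B for that identifier, the second lowest with the second lowest, and so on, which is the pairing shown in the 'ident105' example. If one table has more rows than the other for some identifier, the inner join simply leaves the surplus rows unmatched.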