MYSQL: Avoiding cartesian product of repeating records when self-joining
There are two tables: table A and table B. They have the same columns and the data is practically identical. They both have auto-incremented IDs, the only difference between the two is that they have different IDs for the same records.
Among the columns, there is an IDENTIFIER column which is not unique, i.e. there are (very few) records with the same IDENTIFIER in both tables.
Now, in order to find a correspondence between the IDs of table A and the IDs of table B, I have to join these two tables (for all purposes it's a self-join) on the IDENTIFIER column, something like:
SELECT A.ID, B.ID
FROM A INNER JOIN B ON A.IDENTIFIER = B.IDENTIFIER
But, being IDENTIFIER non-unique, this generates every possible combination of开发者_StackOverflow社区 the repeating values of IDENTIFIER, I don't want that.
Ideally, I would like to generate an one to one association between IDs that have repeating IDENTIFIER values, based on their order. For example, supposing that there are six records with different ID and the same IDENTIFIER value in table A (and thus in table B):
A B
IDENTIFIER:'ident105', ID:10 -> IDENTIFIER:'ident105', ID:3
IDENTIFIER:'ident105', ID:20 -> IDENTIFIER:'ident105', ID:400
IDENTIFIER:'ident105', ID:23 -> IDENTIFIER:'ident105', ID:420
IDENTIFIER:'ident105', ID:100 -> IDENTIFIER:'ident105', ID:512
IDENTIFIER:'ident105', ID:120 -> IDENTIFIER:'ident105', ID:513
IDENTIFIER:'ident105', ID:300 -> IDENTIFIER:'ident105', ID:798
That would be ideal. Anyway, a way to generate a one to one association regardless of the order of the IDs would still be ok (but not preferred).
Thanks for your time,
Silvio
select a_numbered.id, a_numbered.identifier, b_numbered.id from
(
select a.*,
case
when @identifier = a.identifier then @rownum := @rownum + 1
else @rownum := 1
end as rn,
@identifier := a.identifier
from a
join (select @rownum := 0, @identifier := null) r
order by a.identifier
) a_numbered join (
select b.*,
case
when @identifier = b.identifier then @rownum := @rownum + 1
else @rownum := 1
end as rn,
@identifier := b.identifier
from b
join (select @rownum := 0, @identifier := null) r
order by b.identifier
) b_numbered
on a_numbered.rn=b_numbered.rn and a_numbered.identifier=b_numbered.identifier
精彩评论