postgresql: Finding the ids of rows that contain case-insensitive string duplication
I want to select and then delete a list of entries in my tables that have case-insensitive duplications.
In other words, there are these rows that are unique... ..bu开发者_开发问答t they're not unique if you ignore case factor in case. They got in while I wasn't watching.
So how can I select against the column to find the ids that I should delete? (I'm fine with deleting both duplications).
simple sample column structure:
player_id | uname
------------------
34 | BOB
544 | bob
etc...
Players to keep (assuming they registered first)
SELECT min(player_id) as player_id
FROM players
GROUP BY lower(uname)
Use it to dislay the users to remove and their corresponding keeper.
SELECT
players.player_id delete_id,
players.uname delete_uname,
keepers.uname keeper_uname,
keepers.player_id keeper_id
FROM players JOIN
(
SELECT p.player_id, p.uname
FROM players p JOIN
(
SELECT min(player_id) player_id
FROM players
GROUP BY lower(uname)
) as keeper_ids
ON (p.player_id = keeper_ids.player_id)
) as keepers
ON (lower(players.uname) = lower(keepers.uname) AND players.player_id <> keepers.player_id)
ORDER BY keepers.player_id, players.player_id
Output:
delete_id | delete_uname | keeper_uname | keeper_id
---------------------------------------------------
544 | bob | BOB | 34
精彩评论