SQL Multiple Duplicate Row Detection

2023-04-11 16:02

I'm trying to find the correct way to isolate rows in a table that share the same values in two columns.

There are two tables: one (Name) with people's names and IDs, and another (Nation) with people's IDs and their nations. I join the two with an inner join, so the resulting columns are an ID, first name, last name, and nation. If I want to find pairs of people who have the same last name and are from the same nation, why isn't

select ID, FName, LName, Nation
from (Name inner join Nation on Name.ID = Nation.ID)
group by Name, Nation
having count(Name) > 1 and count(Nation) > 1

working?

I'm aiming for the result to be a table with columns:

ID | First | Last | Nation

where the last names and nations will be identical pairs while first names will be different.

I feel like the group by part isn't appropriate, but is there an alternative way? Thanks for any help.

If you are using MS SQL Server:

select
    *
from
(
    select 
        Name.*, 
        Nation.Nation, 
        cnt = count(*) over(partition by LName, Nation) 
    from Name
    join Nation on Nation.ID = Name.ID
) t
where cnt > 1
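
If you want to try the same idea outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, the counter is written in the standard `count(*) OVER (...) AS cnt` form instead of the T-SQL `cnt = ...` form, and it assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Name   (ID INTEGER PRIMARY KEY, FName TEXT, LName TEXT);
    CREATE TABLE Nation (ID INTEGER PRIMARY KEY, Nation TEXT);
    INSERT INTO Name VALUES
        (1, 'Ann',  'Smith'),
        (2, 'Bob',  'Smith'),
        (3, 'Cara', 'Smith'),
        (4, 'Dan',  'Jones');
    INSERT INTO Nation VALUES
        (1, 'US'), (2, 'US'), (3, 'UK'), (4, 'US');
""")

# Count the rows in each (LName, Nation) partition, then keep only
# the rows whose partition holds more than one person.
rows = conn.execute("""
    SELECT ID, FName, LName, Nation
    FROM (
        SELECT Name.ID, Name.FName, Name.LName, Nation.Nation,
               count(*) OVER (PARTITION BY Name.LName, Nation.Nation) AS cnt
        FROM Name
        JOIN Nation ON Nation.ID = Name.ID
    ) t
    WHERE cnt > 1
    ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

With this data only Ann and Bob come back: Cara shares the last name but not the nation, and Dan shares the nation but not the last name.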

Try this:

SELECT * FROM (
  SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
  FROM Name
  INNER JOIN Nation ON (Name.ID = Nation.ID)
) a
INNER JOIN (
  SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
  FROM Name
  INNER JOIN Nation ON (Name.ID = Nation.ID)
) b ON (a.LName = b.LName AND a.Nation = b.Nation)
WHERE a.ID < b.ID
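
To see what the self-join returns, here is a rough sketch run through Python's sqlite3 module on made-up rows. As an assumption of this example, the repeated subquery is factored into a CTE named `people` and both sides of each pair are selected explicitly; the join condition and the `a.ID < b.ID` filter are the same as above.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Name   (ID INTEGER PRIMARY KEY, FName TEXT, LName TEXT);
    CREATE TABLE Nation (ID INTEGER PRIMARY KEY, Nation TEXT);
    INSERT INTO Name VALUES
        (1, 'Ann',  'Smith'),
        (2, 'Bob',  'Smith'),
        (3, 'Cara', 'Smith'),
        (4, 'Dan',  'Jones');
    INSERT INTO Nation VALUES
        (1, 'US'), (2, 'US'), (3, 'US'), (4, 'US');
""")

# Join the combined table to itself on last name and nation.
# a.ID < b.ID drops self-matches and keeps each pair only once.
pairs = conn.execute("""
    WITH people AS (
        SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
        FROM Name
        INNER JOIN Nation ON Name.ID = Nation.ID
    )
    SELECT a.ID, a.FName, b.ID, b.FName, a.LName, a.Nation
    FROM people a
    INNER JOIN people b
        ON a.LName = b.LName AND a.Nation = b.Nation
    WHERE a.ID < b.ID
    ORDER BY a.ID, b.ID
""").fetchall()

for pair in pairs:
    print(pair)
```

Note that three matching people produce three pairs (Ann/Bob, Ann/Cara, Bob/Cara), so the row count grows quickly for large groups.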

As Simon Righarts hinted, something's not right with the design.

Scenario 1)

If a name can have multiple nations, you would have 3 tables implementing an n:m relationship.

CREATE TABLE name (name_id int, name text, ...);
CREATE TABLE nation (nation_id int, nation text, ...);
CREATE TABLE nationality (name_id int references name(name_id)
            ,nation_id int references nation(nation_id)
            ... );

Query for the scenario:

SELECT a.name_id, a.fname, a.lname, n.nation
  FROM name a
  JOIN nationality na USING (name_id)
  JOIN nation n USING (nation_id)
  JOIN (
   SELECT a.lname, na.nation_id
     FROM name a
     JOIN nationality na USING (name_id)
    GROUP BY 1,2
   HAVING count(*) > 1) x USING (lname, nation_id)
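
Here is a small sketch of Scenario 1 that can be run with Python's sqlite3 module. The fname and lname columns are an assumption (the query uses them, although the CREATE TABLE above elides them), the sample rows are invented, and column aliases plus an ORDER BY were added only to keep the example deterministic.

```python
import sqlite3

# In-memory database; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name   (name_id INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE nation (nation_id INTEGER PRIMARY KEY, nation TEXT);
    CREATE TABLE nationality (
        name_id   INTEGER REFERENCES name(name_id),
        nation_id INTEGER REFERENCES nation(nation_id)
    );
    INSERT INTO name VALUES
        (1, 'Ann', 'Smith'), (2, 'Bob', 'Smith'), (3, 'Cara', 'Jones');
    INSERT INTO nation VALUES (1, 'US'), (2, 'UK');
    -- Ann has two nationalities; Bob and Cara have one each.
    INSERT INTO nationality VALUES (1, 1), (1, 2), (2, 1), (3, 1);
""")

# The derived table x lists every (lname, nation_id) combination that
# occurs more than once; joining back to it keeps only those people.
rows = conn.execute("""
    SELECT a.name_id, a.fname, a.lname, n.nation
    FROM name a
    JOIN nationality na USING (name_id)
    JOIN nation n USING (nation_id)
    JOIN (
        SELECT a.lname AS lname, na.nation_id AS nation_id
        FROM name a
        JOIN nationality na USING (name_id)
        GROUP BY 1, 2
        HAVING count(*) > 1
    ) x USING (lname, nation_id)
    ORDER BY a.name_id
""").fetchall()

for row in rows:
    print(row)
```

Ann appears once, for the US, because nobody else named Smith shares her second nationality.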

Scenario 2)

If a name can only have one nation, there would be a column nation_id in the table name:

CREATE TABLE name (name_id int
                  ,name text
                  ,nation_id int references nation(nation_id), ...);
CREATE TABLE nation (nation_id int, nation text, ...);

Query for this scenario:

SELECT a.name_id, a.fname, a.lname, n.nation
  FROM name a
  JOIN nation n USING (nation_id)
  JOIN (
   SELECT a.lname, a.nation_id
     FROM name a
    GROUP BY 1,2
   HAVING count(*) > 1) x USING (lname, nation_id);

All multiple occurrences are included here, not just "pairs" - assuming you meant that.
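
To make the "all multiple occurrences" point concrete, here is a sketch of the Scenario 2 query run through Python's sqlite3 module. The fname and lname columns and the sample rows are assumptions for illustration, and the subquery groups by column name rather than by position.

```python
import sqlite3

# In-memory database; the schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nation (nation_id INTEGER PRIMARY KEY, nation TEXT);
    CREATE TABLE name (
        name_id   INTEGER PRIMARY KEY,
        fname     TEXT,
        lname     TEXT,
        nation_id INTEGER REFERENCES nation(nation_id)
    );
    INSERT INTO nation VALUES (1, 'US'), (2, 'UK');
    INSERT INTO name VALUES
        (1, 'Ann',  'Smith', 1),
        (2, 'Bob',  'Smith', 1),
        (3, 'Cara', 'Smith', 1),
        (4, 'Dan',  'Smith', 2),
        (5, 'Eve',  'Jones', 1);
""")

# Find (lname, nation_id) groups with more than one member,
# then join back to list every member of those groups.
rows = conn.execute("""
    SELECT a.name_id, a.fname, a.lname, n.nation
    FROM name a
    JOIN nation n USING (nation_id)
    JOIN (
        SELECT lname, nation_id
        FROM name
        GROUP BY lname, nation_id
        HAVING count(*) > 1
    ) x USING (lname, nation_id)
    ORDER BY a.name_id
""").fetchall()

for row in rows:
    print(row)
```

All three US Smiths are returned as individual rows, one per person, rather than as pairs.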

Your actual description doesn't fit either scenario.
