Getting almost distinct rows

2023-02-07 17:07

I have a table like this

C1             C2            C3
Mike           London        578
Mike           Bonn          578
Jane           Madrid        245
Billy          Paris         345
Jane           Rome          245

And I need a query that gives me:

C1             C2            C3
Mike           London        578
Jane           Madrid        245
Billy          Paris         345

That is, a query that gives me something like a distinct on C1, ignoring the next occurrences of the same value on C1.

EDIT: Please excuse me, this was just a quick sample, and it seems to have led some of you to think that C3 matters. I'm editing it to make it look more like the real table, which has about 50 columns; the problematic rows are all identical except for one value, which can be discarded.

If you don't care which record the data comes from, you could just write it as:

SELECT C1, min(C2), min(C3)
FROM table
GROUP BY C1

The problem here is that min(C2) and min(C3) could actually mix data from different records.
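To see the mixing concretely, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in database (an assumption made so the example is easy to run; the query itself is plain SQL), and the table name people and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: the two Mike rows differ in both C2 and C3.
rows = [
    ("Mike", "London", 234),
    ("Mike", "Bonn", 578),
    ("Billy", "Paris", 345),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (C1 TEXT, C2 TEXT, C3 INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# Each MIN() is evaluated independently within the group.
grouped = conn.execute(
    "SELECT C1, MIN(C2), MIN(C3) FROM people GROUP BY C1 ORDER BY C1"
).fetchall()

for row in grouped:
    # A result that matches no stored row is a blend of several records.
    status = "real row" if row in rows else "mixed from different rows"
    print(row, "->", status)
```

With this data the Mike group comes back as Bonn with 234, a combination that exists in no stored row.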

If you had a primary key, you could avoid it easily:

SELECT C1, C2, C3
FROM table t
WHERE id IN (
  SELECT min(t2.id) 
  FROM table t2
  GROUP BY t2.C1)

There is really no simple concept of "next occurrences" in SQL, because sets/relations are unordered by default. You must state explicitly how the rows are to be ordered with an ORDER BY clause, and then select the row or rows you want from that ordered relation (using TOP in SQL Server 2000). You don't appear to be sorting by C3 descending (since Jane has a 346 and you want her 245). What ordering is implicit in your word "next" (i.e. you want the first row per distinct person)? How do you wish to define "first" in this query? Do you want each person's lowest C3 value? If so, you could group by person, taking MIN(C3) in an inline view, and join that inline view to another inline view in which you have selected the distinct C1 values.
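One way to read that last suggestion, as a sketch rather than the definitive query: define "first" explicitly as the lowest C3 per C1, compute it in an inline view, and join back to the table to recover the rest of the row. Joining back to the base table (instead of to a second view of distinct C1 values) is a variation chosen here so that C2 is returned too. Python's sqlite3 module stands in for the real database, and the names people, firsts and min_c3 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (C1 TEXT, C2 TEXT, C3 INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Mike", "London", 578),
        ("Mike", "Bonn", 234),
        ("Jane", "Madrid", 245),
        ("Billy", "Paris", 345),
        ("Jane", "Rome", 346),
    ],
)

# "First" is defined explicitly here as the lowest C3 within each C1.
query = """
SELECT p.C1, p.C2, p.C3
FROM people p
JOIN (SELECT C1, MIN(C3) AS min_c3
      FROM people
      GROUP BY C1) firsts
  ON firsts.C1 = p.C1 AND firsts.min_c3 = p.C3
ORDER BY p.C1
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If two rows of the same C1 tie on the lowest C3, both come back, so the column used to define "first" should be unique within each group.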

Use RANK() OVER (PARTITION BY ...) like this (SQL Server 2005, 2008):

declare @table  as table (c1 nvarchar(10), c2 nvarchar(10), c3 int, id int identity(1,1))

insert into @table values
('Mike',           'London',        578),
('Mike',           'Bonn',          234),
('Jane',           'Madrid',        245),
('Billy',          'Paris',         345),
('Jane',           'Rome',          346)

select c1, c2, c3 from 
( select id, c1, c2, c3, RANK() over ( partition by c1 order by id) as Rank from @table) tmp
where tmp.Rank = 1
order by id

Use an interesting WHERE clause like this (SQL Server 2000):

select t2.*
from
@table t2
where 
(select COUNT(*) from @table t where t.c1=t2.c1 and (t2.c2 > t.c2 or (t2.c2 = t.c2 and t2.c3 > t.c3))) = 1
union
select * from @table where c1 in (select c1 from @table group by c1 having COUNT(*) = 1)

This picks a different row per c1 than the query above, because it compares rows by c2 and c3 rather than by id; you'll have to sort that out in your real-world data.
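As a hedged variation on the counting idea: if the subquery counts the rows that sort before the current one and that count is required to be zero, the query keeps the lowest row per c1 by (c2, c3), and the UNION for single-row groups is no longer needed. The sketch below runs it through Python's sqlite3 module as a stand-in database; the table name cities is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (c1 TEXT, c2 TEXT, c3 INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [
        ("Mike", "London", 578),
        ("Mike", "Bonn", 234),
        ("Jane", "Madrid", 245),
        ("Billy", "Paris", 345),
        ("Jane", "Rome", 346),
    ],
)

# Keep a row only when no other row with the same c1 sorts before it
# on (c2, c3). A single-row group trivially has zero such rows.
query = """
SELECT t2.c1, t2.c2, t2.c3
FROM cities t2
WHERE (SELECT COUNT(*)
       FROM cities t
       WHERE t.c1 = t2.c1
         AND (t.c2 < t2.c2
              OR (t.c2 = t2.c2 AND t.c3 < t2.c3))) = 0
ORDER BY t2.c1
"""
lowest = conn.execute(query).fetchall()
for row in lowest:
    print(row)
```

Rows that are identical in both c2 and c3 would all be kept, so this only yields one row per c1 when the compared columns distinguish the rows within a group.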

I am more than likely being silly, but DISTINCT applies to the distinct combination of all the columns in the SELECT list.

To achieve this you would need more data: something to determine which row came first.

Here is some code that I whipped up...

DECLARE @TBL TABLE(
    ID INT IDENTITY(1,1),
    C1 VARCHAR(100),
    C2 VARCHAR(100),
    C3 INT
)
INSERT INTO @TBL VALUES ('Mike','London',578)
INSERT INTO @TBL VALUES ('Mike','Bonn',234)
INSERT INTO @TBL VALUES ('Jane','Madrid',245)
INSERT INTO @TBL VALUES ('Billy','Paris',345)
INSERT INTO @TBL VALUES ('Jane','Rome',346)

-- Keep only the row with the lowest ID for each C1.
-- (ID is unique, so no GROUP BY is needed.)
SELECT * FROM @TBL T
WHERE
ID = (SELECT MIN(ID) FROM @TBL CHILD WHERE CHILD.C1 = T.C1)

Hope this helps?

Tags: sql, sql-server-2000
