SELECT DISTINCT for data groups

2023-01-27 02:42 问答作者：

I have following table:

ID  Data
1   A
2   A
2   B
3   A
3   B
4   C
5   D 
6   A
6   B

etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:

DataID     Data
101        A
102        A
102        B
103        C
104        D

So DataID 102 would resemble data (A,B), DataID 103 would resemble data (C), etc. In order to be able to rewrite my original table in this form:

ID   DataID
1    101
2    102
3    102
4    103
5    104
6    102

How can I do that?

PS. Code to generate the first table:

CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, '开发者_如何学JAVAA'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'

In my opinion You have to create a custom aggregate that concatenates data (in case of strings CLR approach is recommended for perf reasons). Then I would group by ID and select distinct from the grouping, adding a row_number()function or add a dense_rank() your choice. Anyway it should look like this

with groupings as (
select concat(data) groups
from Table1
group by ID
)
select groups, rownumber() over () from groupings

The following query using CASE will give you the result shown below.

From there on, getting the distinct datagroups and proceeding further should not really be a problem.

SELECT     
    id, 
     MAX(CASE data WHEN 'A' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'B' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'C' THEN data ELSE '' END) + 
     MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM  t1
GROUP BY id

ID  DataGroups
1   A
2   AB
3   AB
4   C
5   D
6   AB

However, this kind of logic will only work in case you the "Data" values are both fixed and known before hand.

In your case, you do say that is the case. However, considering that you also say that they are 1000 of them, this will be frankly, a ridiculous looking query for sure :-)

LuckyLuke's suggestion above would, frankly, be the more generic way and probably saner way to go about implementing the solution though in your case.

From your sample data (having added the missing 2,'A' tuple, the following gives the renumbered (and uniqueified) data:

with NonDups as (
select t1.id
from #t1 t1 left join #t1 t2
on t1.id > t2.id and t1.data = t2.data
group by t1.id
having COUNT(t1.data) > COUNT(t2.data)
), DataAddedBack as (
    select ID,data
    from #t1 where id in (select id from NonDups)
), Renumbered as (
    select DENSE_RANK() OVER (ORDER BY id) as ID,Data from DataAddedBack
)
select * from Renumbered

Giving:

1          A
2          A
2          B
3          C
4          D

I think then, it's a matter of relational division to match up rows from this output with the rows in the original table.

Just to share my own dirty solution that I'm using for the moment:

SELECT DISTINCT t1.id, D.data
FROM #t1 t1
CROSS APPLY ( 
    SELECT CAST(Data AS VARCHAR) + ','
    FROM #t1 t2
    WHERE t2.id = t1.id
    ORDER BY Data ASC
    FOR XML PATH('') )  
D ( Data )

And then going analog to LuckyLuke's solution.

继续阅读：distinct-values sql sql-server-2008 tsql

SELECT DISTINCT for data groups

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？