Linq to Sql: Select distinct rows while ignoring specified columns
If I have a query/results set as follows...
from t1 in table1
join t2 in table2 on t1.ID equals t2.ID
join t3 in table3 on t2.ID equals t3.ID
select new MyClass()
{
ID = 开发者_开发问答t1.ID,
Name = t2.Name,
Rank = t2.Rank,
City = t3.City
}
ID | Name | City | Rank
01 | Test | Fake | 876
01 | Test | Fake | 755
02 | Blah | Fake | 765
Doing a .Distinct() will return all 3 records, but what if I want the first and third record and don't want to eliminate any columns from my results set? Is there a way to specify columns to ignore when doing a distinct, or explicitly include in a distinct so that I could do something like this...
// pseudo code
.Distinct(o => o.Name.FirstOrDefault())
I think it's possible to do this using group by, but that seems like it would be slow and messy, especially if I needed to group on multiple columns in order to not exclude too many rows. Any ideas? Thanks.
No, you can't specify a projection by which to determine equality for Distinct
. (I have extension methods to do this for LINQ to Objects, but that's a different matter.)
I suggest you use GroupBy
, possibly in conjunction with First
:
var query = from t1 in table1
join t2 in table2 on t1.ID equals t2.ID
join t3 in table3 on t2.ID equals t3.ID
select new { t1.ID, t2.Name, t2.Rank, t3.City } into tuple
group tuple by new { tuple.ID, tuple.Name, tuple.City } into grouping
let first = grouping.First() // For convenience
select new MyClass
{
ID = first.ID, City = first.City,
Name = first.Name, Rank = first.Rank
};
Note that the results of this may not be deterministic - you're taking the first item of each group, ignoring the rank - so you may get the record with rank 876 or you may get the one with rank 755 in your example.
I don't know exactly what this will look like in SQL - but you should check via logging, and then run the query profiler to see what the cost will be like.
精彩评论