Distinct by property of class with LINQ [duplicate]
I have a collection:
List<Car> cars = new List<Car>();
Cars are uniquely identified by their property CarCode
.
I have three cars in the collection, and two with identical CarCodes.
How can I use LINQ to convert thi开发者_如何学编程s collection to Cars with unique CarCodes?
You can use grouping, and get the first car from each group:
List<Car> distinct =
cars
.GroupBy(car => car.CarCode)
.Select(g => g.First())
.ToList();
Use MoreLINQ, which has a DistinctBy
method :)
IEnumerable<Car> distinctCars = cars.DistinctBy(car => car.CarCode);
(This is only for LINQ to Objects, mind you.)
Same approach as Guffa but as an extension method:
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
{
return items.GroupBy(property).Select(x => x.First());
}
Used as:
var uniqueCars = cars.DistinctBy(x => x.CarCode);
You can implement an IEqualityComparer and use that in your Distinct extension.
class CarEqualityComparer : IEqualityComparer<Car>
{
#region IEqualityComparer<Car> Members
public bool Equals(Car x, Car y)
{
return x.CarCode.Equals(y.CarCode);
}
public int GetHashCode(Car obj)
{
return obj.CarCode.GetHashCode();
}
#endregion
}
And then
var uniqueCars = cars.Distinct(new CarEqualityComparer());
Another extension method for Linq-to-Objects, without using GroupBy:
/// <summary>
/// Returns the set of items, made distinct by the selected value.
/// </summary>
/// <typeparam name="TSource">The type of the source.</typeparam>
/// <typeparam name="TResult">The type of the result.</typeparam>
/// <param name="source">The source collection.</param>
/// <param name="selector">A function that selects a value to determine unique results.</param>
/// <returns>IEnumerable<TSource>.</returns>
public static IEnumerable<TSource> Distinct<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
{
HashSet<TResult> set = new HashSet<TResult>();
foreach(var item in source)
{
var selectedValue = selector(item);
if (set.Add(selectedValue))
yield return item;
}
}
I think the best option in Terms of performance (or in any terms) is to Distinct using the The IEqualityComparer interface.
Although implementing each time a new comparer for each class is cumbersome and produces boilerplate code.
So here is an extension method which produces a new IEqualityComparer on the fly for any class using reflection.
Usage:
var filtered = taskList.DistinctBy(t => t.TaskExternalId).ToArray();
Extension Method Code
public static class LinqExtensions
{
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
{
GeneralPropertyComparer<T, TKey> comparer = new GeneralPropertyComparer<T,TKey>(property);
return items.Distinct(comparer);
}
}
public class GeneralPropertyComparer<T,TKey> : IEqualityComparer<T>
{
private Func<T, TKey> expr { get; set; }
public GeneralPropertyComparer (Func<T, TKey> expr)
{
this.expr = expr;
}
public bool Equals(T left, T right)
{
var leftProp = expr.Invoke(left);
var rightProp = expr.Invoke(right);
if (leftProp == null && rightProp == null)
return true;
else if (leftProp == null ^ rightProp == null)
return false;
else
return leftProp.Equals(rightProp);
}
public int GetHashCode(T obj)
{
var prop = expr.Invoke(obj);
return (prop==null)? 0:prop.GetHashCode();
}
}
You can't effectively use Distinct
on a collection of objects (without additional work). I will explain why.
The documentation says:
It uses the default equality comparer,
Default
, to compare values.
For objects that means it uses the default equation method to compare objects (source). That is on their hash code. And since your objects don't implement the GetHashCode()
and Equals
methods, it will check on the reference of the object, which are not distinct.
Another way to accomplish the same thing...
List<Car> distinticBy = cars
.Select(car => car.CarCode)
.Distinct()
.Select(code => cars.First(car => car.CarCode == code))
.ToList();
It's possible to create an extension method to do this in a more generic way. It would be interesting if someone could evalute performance of this 'DistinctBy' against the GroupBy approach.
You can check out my PowerfulExtensions library. Currently it's in a very young stage, but already you can use methods like Distinct, Union, Intersect, Except on any number of properties;
This is how you use it:
using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);
精彩评论