How to use IEqualityComparer<T>.Equals() in ToLookUp<T>() Extension
I stumbled upon an article regarding the Birthday Paradox and it's implications when overriding the GetHashCode
method, I find myself in a bind.
In tests, we found that in calls to the ToLookup()
Extension, only GetHashcode
is used, despite providing the implementation for Equals.
I think I understand why this happens, the internal working of ToLookup
, HashSet
, Dictionary
, etc, use the HashCodes to store and/or index their elements?
Is there a way to somehow provide the functionality so that the equality comparison is actual performed using the equals method? Or should I not be concerned with the collisions? I haven't done the maths myself, but according to the first article I linked, you would only need 77,163 elements in a list before reaching a 50% chance of collision.
If I understand this correctly, an Equals()
override that compares property by property such as
Return (a.Property1 == b.Property1 && a.Property2 == b.Property2 && ...)
should have a zero chance of collision? So how can I get my ToLookup()
to equality compare this way?
In case you need an example of what I mean:
C#
class Program
{
static void Main(string[] args)
{
DoStuff();
Console.ReadKey();
}
public class AnEntity
{
public int KeyProperty1 { get; set; }
public int KeyProperty2 { get; set; }
public int KeyProperty3 { get; set; }
public string OtherProperty1 { get; set; }
public List<string开发者_C百科> OtherProperty2 { get; set; }
}
public class KeyEntity
{
public int KeyProperty1 { get; set; }
public int KeyProperty2 { get; set; }
public int KeyProperty3 { get; set; }
}
public static void DoStuff()
{
var a = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "foo"};
var b = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "bar"};
var c = new AnEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999, OtherProperty1 = "yada"};
var entityList = new List<AnEntity> { a, b, c };
var lookup = entityList.ToLookup(n => new KeyEntity {KeyProperty1 = n.KeyProperty1, KeyProperty2 = n.KeyProperty2, KeyProperty3 = n.KeyProperty3});
// I want these to all return true
Debug.Assert(lookup.Count == 2);
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].First().OtherProperty1 == "foo");
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].Last().OtherProperty1 == "bar");
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999}].Single().OtherProperty1 == "yada");
}
}
VB
Module Program
Public Sub Main(args As String())
DoStuff()
Console.ReadKey()
End Sub
Public Class AnEntity
Public Property KeyProperty1 As Integer
Public Property KeyProperty2 As Integer
Public Property KeyProperty3 As Integer
Public Property OtherProperty1 As String
Public Property OtherProperty2 As List(Of String)
End Class
Public Class KeyEntity
Public Property KeyProperty1 As Integer
Public Property KeyProperty2 As Integer
Public Property KeyProperty3 As Integer
End Class
Public Sub DoStuff()
Dim a = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "foo"}
Dim b = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "bar"}
Dim c = New AnEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999, .OtherProperty1 = "yada"}
Dim entityList = New List(Of AnEntity) From {a, b, c}
Dim lookup = entityList.ToLookup(Function(n) New KeyEntity With {.KeyProperty1 = n.KeyProperty1, .KeyProperty2 = n.KeyProperty2, .KeyProperty3 = n.KeyProperty3})
' I want these to all return true
Debug.Assert(lookup.Count = 2)
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).First().OtherProperty1 = "foo")
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).Last().OtherProperty1 = "bar")
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999}).Single().OtherProperty1 = "yada")
End Sub
End Module
I can get that to work with an override of GetHashcode()
, no problems. But I don't want to use GetHashcode
because if I have, for example, 109,125 elements in my list, apparently I'm already at 75% chance of collision? If it used aforementioned Equals()
override, I think I'd be at 0%?
The article that you've linked to is completely misleading (and many of its comments highlight this).
GetHashCode
is used where possible because it's fast; if there are hash collisions then Equals
is used to disambiguate between the colliding items. So long as you implement Equals
and GetHashCode
correctly -- whether in the types themselves or a custom IEqualityComparer<T>
implementation -- then there won't be any problems.
The problem with your example code is that you're not overriding Equals
and GetHashCode
at all. This means that the the default implementations are used, and the default implementations use reference comparisons for reference types, not value comparisons.
This means that you're not getting hash collisions because the objects you're comparing against are different to the original objects, even though they have the same values. This, in turn, means that Equals
just isn't required by your example code. Override Equals
and GetHashCode
correctly, or set up an IEqualityComparer<T>
to do so, and everything will start working as you expect.
The birthday paradox does not apply in this situation. The birthday paradox relates to non-deterministic random sets, whereas hashcode computation is determinitic. the chances of 2 objects with different state sharing the same hashcode is much closer to 1 in a billion or so, certainly not as low as 77 thousand - therefore I dont think you have anything to worry about.
精彩评论