Overriding GetHashCode in VB without checked/unchecked keyword support?
So I'm trying to figure out how to correctly override GetHashCode() in VB for a large number of custom objects. A bit of searching leads me to this wonderful answer.
Except there's one problem: VB lacks both the checked and unchecked keywords in .NET 4.0. As far as I can tell, anyways. So using Jon Skeet's implementation, I tried creating such an override on a rather simple class that has three main members: Name As String, Value As Int32, and [Type] As System.Type. Thus I come up with:
Public Overrides Function GetHashCode() As Int32
    Dim hash As Int32 = 17
    hash = hash * 23 + _Name.GetHashCode()
    hash = hash * 23 + _Value
    hash = hash * 23 + _Type.GetHashCode()
    Return hash
End Function
Problem: Int32 is too small for even a simple object such as this. The particular instance I tested has "Name" as a simple 5-character string, and that hash alone was close enough to Int32's upper limit that when it tried to calculate the second field of the hash (Value), it overflowed. Because I can't find a VB equivalent for granular checked/unchecked support, I can't work around this.
I also do not want to remove Integer overflow checks across the entire project. This thing is maybe....40% complete (I made that up, TBH), and I have a lot more code to write, so I need these overflow checks in place for quite some time.
What would be the "safe" version of Jon's GetHashCode implementation for VB and Int32? Or, does .NET 4.0 have checked/unchecked in it somewhere that I'm not finding very easily on MSDN?
Translated from C# into a more readable VB and aligned to the object described above (Name, Value, Type), we get:
Public Overrides Function GetHashCode() As Int32
    Return New With { _
        Key .A = _Name, _
        Key .B = _Value, _
        Key .C = _Type
    }.GetHashCode()
End Function
This apparently triggers the compiler to "cheat" by generating an anonymous type, which it then compiles outside of the project namespace, presumably with integer overflow checks disabled, and allows the math to take place and simply wrap around when it overflows. It also seems to involve box opcodes, which I know to be performance hits. No unboxing, though.
But this raises an interesting question. Countless times, I've seen it stated here and elsewhere that both VB and C# generate the same IL code. This is clearly not the case 100% of the time... For example, the use of C#'s unchecked keyword simply causes a different opcode to get emitted. So why do I keep seeing the assumption that both produce the exact same IL repeated?
Anyways, I'd rather find a solution that can be implemented within each object module. Having to create Anonymous Types for every single one of my objects is going to look messy from an ILDASM perspective. I'm not kidding when I say I have a lot of classes implemented in my project.
EDIT2: I did open up a bug on MSFT Connect, and the gist of the outcome from the VB PM was that they'll consider it, but don't hold your breath: https://connect.microsoft.com/VisualStudio/feedback/details/636564/checked-unchecked-keywords-in-visual-basic
A quick look at the changes in .NET 4.5 suggests they've not considered it yet, so maybe .NET 5?
My final implementation, which fits the constraints of GetHashCode, while still being fast and unique enough for VB is below, derived from the "Rotating Hash" example on this page:
'// The only sane way to do hashing in VB.NET because it lacks the
'// checked/unchecked keywords that C# has.
Public Const HASH_PRIME1 As Int32 = 4
Public Const HASH_PRIME2 As Int32 = 28
Public Const INT32_MASK As Int32 = &HFFFFFFFF

Public Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
    Return ((hash << HASH_PRIME1) Xor (hash >> HASH_PRIME2) Xor hashcode)
End Function
I think the "Shift-Add-XOR" hash may also apply, but I haven't tested it.
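For anyone wondering how RotateHash would be called, here is a rough sketch wired into the Name/Value/Type class from the question. The class name Setting, its constructor, and the final masking step are assumptions added for illustration and are not part of the implementation above. The mask drops the sign bit so that the narrowing CInt cannot throw.

```vb
Imports System

Public Class Setting   ' hypothetical class name, for illustration only
    Private ReadOnly _Name As String
    Private ReadOnly _Value As Int32
    Private ReadOnly _Type As System.Type

    Public Const HASH_PRIME1 As Int32 = 4
    Public Const HASH_PRIME2 As Int32 = 28

    Public Sub New(ByVal name As String, ByVal value As Int32, ByVal valueType As System.Type)
        _Name = name
        _Value = value
        _Type = valueType
    End Sub

    ' Same rotate step as above: shifts and Xor never raise OverflowException.
    Public Shared Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
        Return ((hash << HASH_PRIME1) Xor (hash >> HASH_PRIME2) Xor hashcode)
    End Function

    Public Overrides Function GetHashCode() As Int32
        Dim hash As Int64 = 17
        hash = RotateHash(hash, _Name.GetHashCode())
        hash = RotateHash(hash, _Value)
        hash = RotateHash(hash, _Type.GetHashCode())
        ' Assumption: keep only the low 31 bits so the conversion is always in range.
        Return CInt(hash And &H7FFFFFFFL)
    End Function
End Class
```

Because only shifts, Xor and And are involved, this should behave the same whether or not integer overflow checks are enabled for the project.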
Use Long to avoid the overflow:
Dim hash As Long = 17
'' etc..
Return CInt(hash And &H7fffffffL)
The And operator ensures no overflow exception is thrown. This does, however, lose one bit of "precision" in the computed hash code: the result is never negative. VB.NET has no built-in function to avoid it, but you can use a trick:
Imports System.Runtime.InteropServices

Module NoOverflows
    Public Function LongToInteger(ByVal value As Long) As Integer
        Dim cast As Caster
        cast.LongValue = value
        Return cast.IntValue
    End Function

    <StructLayout(LayoutKind.Explicit)> _
    Private Structure Caster
        <FieldOffset(0)> Public LongValue As Long
        <FieldOffset(0)> Public IntValue As Integer
    End Structure
End Module
Now you can write:
Dim hash As Long = 17
'' etc..
Return NoOverflows.LongToInteger(hash)
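To make the difference between the two narrowing approaches concrete, here is a small side-by-side sketch. The module name NarrowingDemo and the helper MaskToInteger are made up for this example. Note that the overlapping-field trick reads the low four bytes of the Long only on little-endian hardware, which is an assumption worth stating.

```vb
Imports System
Imports System.Runtime.InteropServices

Module NarrowingDemo   ' hypothetical module name
    <StructLayout(LayoutKind.Explicit)> _
    Private Structure Caster
        <FieldOffset(0)> Public LongValue As Long
        <FieldOffset(0)> Public IntValue As Integer
    End Structure

    ' Union trick: reinterpret the low 4 bytes (assumes little-endian layout).
    Public Function LongToInteger(ByVal value As Long) As Integer
        Dim cast As Caster
        cast.LongValue = value
        Return cast.IntValue
    End Function

    ' Mask approach: keep only the low 31 bits, so the result is never negative.
    Public Function MaskToInteger(ByVal value As Long) As Integer
        Return CInt(value And &H7FFFFFFFL)
    End Function

    Sub Main()
        Dim sample As Long = &HFFFFFFFFL            ' low 32 bits all set
        Console.WriteLine(MaskToInteger(sample))    ' 2147483647
        Console.WriteLine(LongToInteger(sample))    ' -1 on little-endian hardware
    End Sub
End Module
```

Either result is a legal hash code; the union version simply keeps all 32 low bits instead of 31.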
Here is an implementation combining Hans Passant's answer and Jon Skeet's answer.
It works even for millions of properties (i.e. no integer overflow exceptions) and is very fast (less than 20 ms to generate hash code for a class with 1,000,000 fields and barely measurable for a class with only 100 fields).
Here is the structure to handle the overflows:
<StructLayout(LayoutKind.Explicit)>
Private Structure HashCodeNoOverflow
    <FieldOffset(0)> Public Int64 As Int64
    <FieldOffset(0)> Public Int32 As Int32
End Structure
And a simple GetHashCode function:
Public Overrides Function GetHashCode() As Integer
    Dim hashCode As HashCodeNoOverflow
    hashCode.Int64 = 17
    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field1.GetHashCode
    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field2.GetHashCode
    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field3.GetHashCode
    Return hashCode.Int32
End Function
Or if you prefer:
Public Overrides Function GetHashCode() As Integer
    Dim hashCode = New HashCodeNoOverflow With {.Int32 = 17}
    For Each field In Fields
        hashCode.Int64 = CLng(hashCode.Int32) * 23 + field.GetHashCode
    Next
    Return hashCode.Int32
End Function
I had the same problem implementing Mr. Skeet's solution in VB.NET. I ended up using the Mod operator to get there. Each Mod by Integer.MaxValue keeps the running value strictly between -Integer.MaxValue and Integer.MaxValue, so it always fits in an Int32. The numbers are not the ones that unchecked wraparound would give, but the effect is the same in that nothing overflows. You probably don't have to Mod as often as I do (it's only needed when there's a chance of getting bigger than a Long, which would mean combining a LOT of hash codes, and then once at the end), but a variant of this works for me (and lets you play with much bigger primes, like some of the other hash functions, without worrying).
Public Overrides Function GetHashCode() As Int32
    Dim hash As Int64 = 17
    hash = (hash * 23 + _Name.GetHashCode()) Mod Integer.MaxValue
    hash = (hash * 23 + _Value) Mod Integer.MaxValue
    hash = (hash * 23 + _Type.GetHashCode()) Mod Integer.MaxValue
    Return Convert.ToInt32(hash)
End Function
You can implement a suitable hash code helper in a separate assembly, either using C# and the unchecked keyword or by turning overflow checking off for the entire project (possible in both VB.NET and C# projects). If you want to, you can then use ilmerge to merge this assembly into your main assembly.
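As a minimal sketch of the VB.NET variant of this idea, the helper below is meant to live in its own small class library that is compiled with integer overflow checks removed. The module name HashHelper and the function Combine are hypothetical. The arithmetic is the same multiply-and-add scheme from the question, and it only wraps silently if that library really is built without the checks; with checks enabled it can still throw OverflowException.

```vb
' Assumption: this file is compiled in a separate class library project
' that has integer overflow checks removed. HashHelper and Combine are
' hypothetical names used only for this example.
Public Module HashHelper
    Public Function Combine(ByVal ParamArray hashCodes As Integer()) As Integer
        Dim hash As Integer = 17
        For Each code As Integer In hashCodes
            ' Wraps around silently only when overflow checks are removed.
            hash = hash * 23 + code
        Next
        Return hash
    End Function
End Module
```

From the main project, which keeps its overflow checks, a class would then call something like HashHelper.Combine(_Name.GetHashCode(), _Value, _Type.GetHashCode()) inside its GetHashCode override.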
Improved answer to "Overriding GetHashCode in VB without checked/unchecked keyword support?":
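Public Overrides Function GetHashCode() As Integer
    Dim hashCode As Long = 0
    ' Check the same fields that are hashed (the original checked differently named ones).
    If myField IsNot Nothing Then _
        hashCode = ((hashCode * 397) Xor myField.GetHashCode()) And &HffffffffL
    If myOtherField IsNot Nothing Then _
        hashCode = ((hashCode * 397) Xor myOtherField.GetHashCode()) And &HffffffffL
    ' Drop bit 31 before narrowing: a value above Integer.MaxValue would make CInt throw.
    Return CInt(hashCode And &H7fffffffL)
End Function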
There is a trimming after each multiplication. The literal is declared explicitly as Long because the And operator with an Integer argument does not zero the upper bytes.
After researching and finding that VB had not given us anything like unchecked, and raging for a bit (C# dev now doing VB), I implemented a solution close to the one Hans Passant posted. I failed at it. Terrible performance. This was certainly due to my implementation and not the solution Hans posted. I could have gone back and more closely copied his solution.
However, I solved the problem with a different solution. A post complaining about the lack of unchecked on the VB language feature requests page gave me the idea to use a hash algorithm already in the framework. In my problem, I had a String and a Guid that I wanted to use for a dictionary key. I decided a Tuple(Of Guid, String) would be a fine internal data store.
Original Bad Version
Public Structure HypnoKey
    Public Sub New(name As String, areaId As Guid)
        _name = name
        _areaId = areaId
    End Sub

    Private ReadOnly _name As String
    Private ReadOnly _areaId As Guid

    Public ReadOnly Property Name As String
        Get
            Return _name
        End Get
    End Property

    Public ReadOnly Property AreaId As Guid
        Get
            Return _areaId
        End Get
    End Property

    Public Overrides Function GetHashCode() As Integer
        'OMFG SO BAD
        'TODO Fail less hard
    End Function
End Structure
Much Improved Version
Public Structure HypnoKey
    Public Sub New(name As String, areaId As Guid)
        _innerKey = New Tuple(Of Guid, String)(areaId, name)
    End Sub

    Private ReadOnly _innerKey As Tuple(Of Guid, String)

    Public ReadOnly Property Name As String
        Get
            Return _innerKey.Item2
        End Get
    End Property

    Public ReadOnly Property AreaId As Guid
        Get
            Return _innerKey.Item1
        End Get
    End Property

    Public Overrides Function GetHashCode() As Integer
        Return _innerKey.GetHashCode() 'wow! such fast (enuf)
    End Function
End Structure
So, while I expect there are far better solutions than this, I am pretty happy. My performance is good. Also, the nasty utility code is gone. Hopefully this is useful to some other poor dev forced to write VB who comes across this post.
Cheers
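The same idea can be applied to the Name/Value/Type class from the question. This is only a sketch: the class name NamedValue and its constructor are assumptions, and unlike the structure above it builds a fresh Tuple on every call instead of storing one, which costs an allocation each time.

```vb
Imports System

Public Class NamedValue   ' hypothetical class name, for illustration only
    Private ReadOnly _Name As String
    Private ReadOnly _Value As Int32
    Private ReadOnly _Type As System.Type

    Public Sub New(ByVal name As String, ByVal value As Int32, ByVal valueType As System.Type)
        _Name = name
        _Value = value
        _Type = valueType
    End Sub

    Public Overrides Function GetHashCode() As Integer
        ' Let the framework's Tuple combine the three hash codes.
        ' No arithmetic is done in this project's own code.
        Return Tuple.Create(_Name, _Value, _Type).GetHashCode()
    End Function
End Class
```

If GetHashCode is called often, storing the tuple in a field as in the structure above avoids the repeated allocation.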
I've also found that the RemoveIntegerChecks MSBuild property controls the /removeintchecks VB compiler option, which prevents the compiler from emitting runtime overflow checks:
<PropertyGroup>
  <RemoveIntegerChecks>true</RemoveIntegerChecks>
</PropertyGroup>