C# hashcode for array of ints

I have a class that internally is just an array of integers. Once constructed, the array never changes. I'd like to pre-compute a good hashcode so that this class can be very efficiently used as a key in a Dictionary. The length of the array is less than about 30 items, and the integers are between -1000 and 1000 in general.


Not very clever, but sufficient for most practical purposes:

EDIT: changed due to comment of Henk Holterman, thanks for that.

  int hc = array.Length;
  foreach (int val in array)
  {
      hc = unchecked(hc * 314159 + val);
  }

If you need something more sophisticated, look here.
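Since the question asks for a pre-computed hash on an immutable class, here is a minimal sketch of how the loop above could be cached in a Dictionary key type. The class name IntArrayKey and its members are hypothetical, not from the question; the sketch assumes the array is copied on construction so the cached value can never go stale.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper type; the name IntArrayKey is an assumption for illustration.
public sealed class IntArrayKey : IEquatable<IntArrayKey>
{
    private readonly int[] _values;
    private readonly int _hash; // computed once in the constructor

    public IntArrayKey(int[] values)
    {
        // Copy so later changes to the caller's array cannot invalidate the cached hash.
        _values = (int[])values.Clone();

        int hc = _values.Length;
        foreach (int val in _values)
        {
            hc = unchecked(hc * 314159 + val);
        }
        _hash = hc;
    }

    public override int GetHashCode() => _hash;

    public bool Equals(IntArrayKey other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        // Cheap rejections first: differing cached hash or differing length.
        if (_hash != other._hash || _values.Length != other._values.Length) return false;
        for (int i = 0; i < _values.Length; i++)
        {
            if (_values[i] != other._values[i]) return false;
        }
        return true;
    }

    public override bool Equals(object obj) => Equals(obj as IntArrayKey);
}
```

Dictionary also needs Equals to compare contents, which is why the sketch overrides it alongside GetHashCode. Two keys built from arrays with the same elements then find the same Dictionary<IntArrayKey, TValue> entry.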


For an array of values generally between -1000 and 1000, I would probably use something like this:

static int GetHashCode(int[] values)
{
   int result = 0;
   int shift = 0;
   for (int i = 0; i < values.Length; i++)
   {
      shift = (shift + 11) % 21;
      result ^= (values[i]+1024) << shift;
   }
   return result;
}


You could use a CRC32 checksum. Here is the code:

using System;
using System.IO;

[CLSCompliant(false)]
public class Crc32 {
    uint[] table = new uint[256];
    uint[] Table { get { return table; } }

    public Crc32() {
        MakeCrcTable();
    }
    void MakeCrcTable() {
        for (uint n = 0; n < 256; n++) {
            uint value = n;
            for (int i = 0; i < 8; i++) {
                if ((value & 1) != 0)
                    value = 0xedb88320 ^ (value >> 1);
                else
                    value = value >> 1;
            }
            Table[n] = value;
        }
    }
    public uint UpdateCrc(uint crc, byte[] buffer, int length) {
        uint result = crc;
        for (int n = 0; n < length; n++) {
            result = Table[(result ^ buffer[n]) & 0xff] ^ (result >> 8);
        }
        return result;
    }
    public uint Calculate(Stream stream) {
        long pos = stream.Position;
        const int size = 0x32000;
        byte[] buf = new byte[size];
        int bytes = 0;
        uint result = 0xffffffff;
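        do {
            // Read may return fewer bytes than requested before the end
            // of the stream; only a return value of 0 signals the end.
            bytes = stream.Read(buf, 0, size);
            result = UpdateCrc(result, buf, bytes);
        }
        while (bytes > 0);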
        stream.Position = pos;
        return ~result;
    }
}
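The class above works on streams and byte buffers, so an int array has to be turned into bytes first. The following is a rough sketch of one way to do that, assuming a standalone helper (the name IntArrayCrc32 is hypothetical) that uses the same table-driven CRC32 and copies the ints with Buffer.BlockCopy in the machine's native byte order.

```csharp
using System;

// Hypothetical helper, not part of the answer above.
public static class IntArrayCrc32
{
    private static readonly uint[] Table = MakeTable();

    private static uint[] MakeTable()
    {
        var table = new uint[256];
        for (uint n = 0; n < 256; n++)
        {
            uint value = n;
            for (int i = 0; i < 8; i++)
            {
                value = (value & 1) != 0 ? 0xedb88320 ^ (value >> 1) : value >> 1;
            }
            table[n] = value;
        }
        return table;
    }

    public static int Compute(int[] values)
    {
        // Copy the ints into a byte buffer: 4 bytes per int, native byte order.
        var bytes = new byte[values.Length * sizeof(int)];
        Buffer.BlockCopy(values, 0, bytes, 0, bytes.Length);

        uint crc = 0xffffffff;
        foreach (byte b in bytes)
        {
            crc = Table[(crc ^ b) & 0xff] ^ (crc >> 8);
        }
        // Reinterpret the 32-bit checksum as a signed int for GetHashCode.
        return unchecked((int)~crc);
    }
}
```

For arrays of about 30 elements the extra byte copy is small, but it is still an allocation per call, so this fits best when the result is computed once and cached.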


I think choosing a good hash-algorithm would have to be based on the distribution (in a probability sense) of the integer values.

Have a look at Wikipedia for a list of algorithms.


Any CRC (or even XOR) should be ok.
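One property worth checking before settling on plain XOR: it is commutative, so any reordering of the same values produces the same hash. The sketch below illustrates this next to an order-sensitive multiply-and-add variant; the class and method names are hypothetical, and the multiplier 31 is just a common choice, not something from this thread.

```csharp
using System;

public static class XorDemo
{
    // Plain XOR of all elements: the result does not depend on element order.
    public static int XorHash(int[] values)
    {
        int h = 0;
        foreach (int v in values) h ^= v;
        return h;
    }

    // Multiply-and-add: each step depends on the running value, so order matters.
    public static int OrderedHash(int[] values)
    {
        int h = values.Length;
        foreach (int v in values) h = unchecked(h * 31 + v);
        return h;
    }
}
```

With the inputs { 1, 2, 3 } and { 3, 2, 1 }, XorHash returns the same value for both, while OrderedHash returns different values. Whether that matters depends on whether permutations of the same values are expected among the keys.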


You could take a different approach and use a nested dictionary (effectively a trie) with one level for each value in your int array. This way you can leave .NET to do the primitive type hashing.

internal class DictionaryEntry<TKey, TValue>
{
    public Dictionary<TKey, DictionaryEntry<TKey, TValue>> Children { get; private set; }
    public TValue Value { get; private set; }
    public bool HasValue { get; private set; }

    public void SetValue(TValue value)
    {
        Value = value;
        HasValue = true;
    }

    public DictionaryEntry()
    {
        Children = new Dictionary<TKey, DictionaryEntry<TKey, TValue>>();
    }
}

internal class KeyStackDictionary<TKey, TValue>
{
    // Helper dictionary to work with a stack of keys
    // Usage:
    // var dict = new KeyStackDictionary<int, string>();
    // int[] keyStack = new int[] {23, 43, 54};
    // dict.SetValue(keyStack, "foo");
    // string value;
    // if (dict.GetValue(keyStack, out value))
    // {   
    // }

    private DictionaryEntry<TKey, TValue> _dict;

    public KeyStackDictionary()
    {
        _dict = new DictionaryEntry<TKey, TValue>();
    }

    public void SetValue(TKey[] keyStack, TValue value)
    {
        DictionaryEntry<TKey, TValue> dict = _dict;

        for (int i = 0; i < keyStack.Length; i++)
        {
            TKey key = keyStack[i];
            if (dict.Children.ContainsKey(key))
            {
                dict = dict.Children[key];
            }
            else
            {
                var child = new DictionaryEntry<TKey, TValue>();
                dict.Children.Add(key, child);
                dict = child;
            }

            if (i == keyStack.Length - 1)
            {
                dict.SetValue(value);
            }
        }
    }

    // returns false if the value is not found using the key stack
    public bool GetValue(TKey[] keyStack, out TValue value)
    {
        DictionaryEntry<TKey, TValue> dict = _dict;

        for (int i = 0; i < keyStack.Length; i++)
        {
            TKey key = keyStack[i];

            if (dict.Children.ContainsKey(key))
            {
                dict = dict.Children[key];
            }
            else
            {
                break;
            }

            if (i == keyStack.Length - 1 && dict.HasValue)
            {
                value = dict.Value;
                return true;
            }
        }

        value = default(TValue);
        return false;
    }
}


You can use LINQ methods too:

using System;
using System.Linq;

var array = new int[10];
var hashCode = array.Aggregate(0, (a, v) => 
    HashCode.Combine(a, v.GetHashCode()));


I'm using this:

var arrayHash = string.Join(string.Empty, array).GetHashCode();

If an element in the array changes, you will usually get a new hash.
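One thing to watch with an empty separator is that different arrays can join to the same string before hashing even starts. The sketch below, with hypothetical helper names, shows the effect and how a non-empty separator (here a comma, chosen arbitrarily) avoids it.

```csharp
using System;

public static class JoinDemo
{
    // Same construction as above: digits of neighbouring elements run together.
    public static string KeyWithoutSeparator(int[] values) => string.Join(string.Empty, values);

    // A separator that cannot appear inside a formatted int keeps elements apart.
    public static string KeyWithSeparator(int[] values) => string.Join(",", values);
}
```

For example, { 1, 23 } and { 12, 3 } both join to "123" without a separator, but to "1,23" and "12,3" with one. Also keep in mind that on .NET Core and later, string.GetHashCode() is randomized per process, so the resulting value should not be persisted or compared across runs.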


I would recommend:

HashCode.Combine(array)

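For .NET Core 2.1 / .NET Standard 2.1 / .NET 5 and later. Note that when given the array itself, HashCode.Combine hashes the array reference rather than its elements, so two arrays with equal contents generally produce different hash codes. To hash the contents, feed each element to a HashCode instance with Add and then call ToHashCode.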
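To hash the contents of the array with System.HashCode, the elements can be added one at a time. This is a minimal sketch; the class name ArrayHash is hypothetical, and mixing in the length first is an optional design choice rather than anything the API requires.

```csharp
using System;

public static class ArrayHash
{
    public static int Compute(int[] values)
    {
        var hash = new HashCode();
        hash.Add(values.Length);   // optional: distinguishes arrays of different length early
        foreach (int v in values)
        {
            hash.Add(v);           // order-sensitive accumulation
        }
        return hash.ToHashCode();
    }
}
```

HashCode uses a per-process random seed, so the values are consistent within one run (which is all a Dictionary needs) but should not be stored or compared across processes.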
