开发者

Fastest way to calculate sum of bits in byte array

I have two byte arrays with the same length. I need to perform XOR operation between each byte and after this calculate sum of bits.

For example:

11110000^01010101 = 10100101 -> so 1+1+1+1 = 4

I need do the same operation for each element in byt开发者_JAVA技巧e array.


Use a lookup table. There are only 256 possible values after XORing, so it's not exactly going to take a long time. Unlike izb's solution though, I wouldn't suggest manually putting all the values in though - compute the lookup table once at startup using one of the looping answers.

For example:

public static class ByteArrayHelpers
{
    private static readonly int[] LookupTable =
        Enumerable.Range(0, 256).Select(CountBits).ToArray();

    private static int CountBits(int value)
    {
        int count = 0;
        for (int i=0; i < 8; i++)
        {
           count += (value >> i) & 1;
        }
        return count;
    }

    public static int CountBitsAfterXor(byte[] array)
    {
        int xor = 0;
        foreach (byte b in array)
        {
            xor ^= b;
        }
        return LookupTable[xor];
    }
}

(You could make it an extension method if you really wanted...)

Note the use of byte[] in the CountBitsAfterXor method - you could make it an IEnumerable<byte> for more generality, but iterating over an array (which is known to be an array at compile-time) will be faster. Probably only microscopically faster, but hey, you asked for the fastest way :)

I would almost certainly actually express it as

public static int CountBitsAfterXor(IEnumerable<byte> data)

in real life, but see which works better for you.

Also note the type of the xor variable as an int. In fact, there's no XOR operator defined for byte values, and if you made xor a byte it would still compile due to the nature of compound assignment operators, but it would be performing a cast on each iteration - at least in the IL. It's quite possible that the JIT would take care of this, but there's no need to even ask it to :)


Fastest way would probably be a 256-element lookup table...

int[] lut
{
    /*0x00*/ 0,
    /*0x01*/ 1,
    /*0x02*/ 1,
    /*0x03*/ 2
    ...
    /*0xFE*/ 7,
    /*0xFF*/ 8
}

e.g.

11110000^01010101 = 10100101 -> lut[165] == 4


This is more commonly referred to as bit counting. There are literally dozens of different algorithms for doing this. Here is one site which lists a few of the more well known methods. There are even CPU specific instructions for doing this.

Theorectically, Microsoft could add a BitArray.CountSetBits function that gets JITed with the best algorithm for that CPU architecture. I, for one, would welcome such an addition.


As I understood it you want to sum the bits of each XOR between the left and right bytes.

for (int b = 0; b < left.Length; b++) {
  int num = left[b] ^ right[b];
  int sum = 0;

  for (int i = 0; i < 8; i++) {
    sum += (num >> i) & 1;
  }

   // do something with sum maybe?
}


I'm not sure if you mean sum the bytes or the bits. To sum the bits within a byte, this should work:

int nSum = 0;
for (int i=0; i<=7; i++)
{
   nSum += (byte_val>>i) & 1;
}

You would then need the xoring, and array looping around this, of course.


The following should do

int BitXorAndSum(byte[] left, byte[] right) {
  int sum = 0;
  for ( var i = 0; i < left.Length; i++) { 
    sum += SumBits((byte)(left[i] ^ right[i]));
  }
  return sum;
}

int SumBits(byte b) {
  var sum = 0;
  for (var i = 0; i < 8; i++) {
    sum += (0x1) & (b >> i);
  }
  return sum;
}


It can be rewritten as ulong and use unsafe pointer, but byte is easier to understand:

static int BitCount(byte num)
{
    // 0x5 = 0101 (bit) 0x55 = 01010101
    // 0x3 = 0011 (bit) 0x33 = 00110011
    // 0xF = 1111 (bit) 0x0F = 00001111
    uint count = num;
    count = ((count >> 1) & 0x55) + (count & 0x55);
    count = ((count >> 2) & 0x33) + (count & 0x33);
    count = ((count >> 4) & 0xF0) + (count & 0x0F);
    return (int)count;
}


A general function to count bits could look like:

int Count1(byte[] a)
{
  int count = 0;
  for (int i = 0; i < a.Length; i++)
  {
    byte b = a[i];
    while (b != 0)
    {
      count++;
      b = (byte)((int)b & (int)(b - 1));
    }
  }
  return count;
}

The less 1-bits, the faster this works. It simply loops over each byte, and toggles the lowest 1 bit of that byte until the byte becomes 0. The castings are necessary so that the compiler stops complaining about the type widening and narrowing.

Your problem could then be solved by using this:

int Count1Xor(byte[] a1, byte[] a2)
{
  int count = 0;
  for (int i = 0; i < Math.Min(a1.Length, a2.Length); i++)
  {
    byte b = (byte)((int)a1[i] ^ (int)a2[i]);
    while (b != 0)
    {
      count++;
      b = (byte)((int)b & (int)(b - 1));
    }
  }
  return count;
}


A lookup table should be the fastest, but if you want to do it without a lookup table, this will work for bytes in just 10 operations.

public static int BitCount(byte value) {
    int v = value - ((value >> 1) & 0x55);
    v = (v & 0x33) + ((v >> 2) & 0x33);
    return ((v + (v >> 4) & 0x0F));
}

This is a byte version of the general bit counting function described at Sean Eron Anderson's bit fiddling site.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜