C# byte[] comparison without bound checks
I am looking for performance efficient ways to开发者_如何学编程 compare two byte[] for equality. Sizes are above 1 MB, so the overhead for each array element should be minimized.
I aim to beat the speeds of SequenceEqual
or a hand-coded for-loop over every item, by avoiding the repetitive bound checks for both arrays. In the same way that Array.Copy
could lead to fast memcpy
, what will lead to a memcmp
?
You can use unsafe code to do pointer operations. You can compare the bytes four at a time as integers:
public static bool ArrayCompare(byte[] a, byte[] b) {
if (a.Length != b.Length) return false;
int len = a.Length;
unsafe {
fixed(byte* ap = a, bp = b) {
int* aip = (int*)ap, bip = (int*)bp;
for (;len >= 4;len-=4) {
if (*aip != *bip) return false;
aip++;
bip++;
}
byte* ap2 = (byte*)aip, bp2 = (byte*)bip;
for (;len>0;len--) {
if (*ap2 != *bp2) return false;
ap2++;
bp2++;
}
}
}
return true;
}
A tested this against a simple loop, and it's about six times faster.
As suggested by Josh Einstein, long could be used on a 64 bit system. Actually it seems to be almost twice as fast both on 32 and 64 bit systems:
public static bool ArrayCompare64(byte[] a, byte[] b) {
if (a.Length != b.Length) return false;
int len = a.Length;
unsafe {
fixed (byte* ap = a, bp = b) {
long* alp = (long*)ap, blp = (long*)bp;
for (; len >= 8; len -= 8) {
if (*alp != *blp) return false;
alp++;
blp++;
}
byte* ap2 = (byte*)alp, bp2 = (byte*)blp;
for (; len > 0; len--) {
if (*ap2 != *bp2) return false;
ap2++;
bp2++;
}
}
}
return true;
}
If performance really matters then the fastest way to do it is by using the CRT library included with every version of Windows. This code takes ~51 msec on my poky laptop, works on 64-bit machines too:
using System;
using System.Runtime.InteropServices;
using System.Diagnostics;
class Program {
static void Main(string[] args) {
byte[] arr1 = new byte[50 * 1024 * 1024];
byte[] arr2 = new byte[50 * 1024 * 1024];
var sw = Stopwatch.StartNew();
bool equal = memcmp(arr1, arr2, arr1.Length) == 0;
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Console.ReadLine();
}
[DllImport("msvcrt.dll")]
private static extern int memcmp(byte[] arr1, byte[] arr2, int cnt);
}
From: http://www.pinvoke.net/default.aspx/msvcrt.memcmp : Belowmentioned signature (by Saar) of memcmp is a x64 only signature. Using the the x64 only signatures on an x86 machine will result in a PInvoke stack imbalance. For x86 and x64 platform compatibility make sure you use a signature which specifies the Cdecl calling convention and uses the UIntPtr type to correctly marshall the size_t count argument:
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
static extern int memcmp(byte[] b1, byte[] b2, UIntPtr count);
static bool doImagesMatch(byte[] b1, byte[] b2)
{
return b1.Length == b2.Length && memcmp(b1, b2, new UIntPtr((uint)b1.Length)) == 0;
}
I use this code with success but I didn't have time to measure performance (yet). I'm using small array's of roughly 600 bytes. I have to use x86-compatible code because the vast majority of the computers in our non-profit organisation is x86.
Obviously you need a fast algorhythm to convert bitmap to byte[].
[DllImport("msvcrt.dll")] unsafe static extern int memcmp(void* b1, void* b2, long count);
unsafe static int ByteArrayCompare1(byte[] b1, int b1Index, int b1Length, byte[] b2, int b2Index, int b2Length)
{
CompareCount++;
fixed (byte* p1 = b1)
fixed (byte* p2 = b2)
{
int cmp = memcmp(p1 + b1Index, p2 + b2Index, Math.Min(b1Length, b2Length));
if (cmp == 0)
{
cmp = b1Length.CompareTo(b2Length);
}
return cmp;
}
}
精彩评论