开发者

Memory comparison (with difference position)

Is there a way to compare two blocks of memory, and know at which point they differ (memcmp() does not meet this requirem开发者_StackOverflow中文版ent)? I wouldn't want to perform costly loops. Thanks in advance.

Regards, Neo_b


std::mismatch will do that for you in conjunction std::distance.


Compared to whatever else you are doing, a loop is cheap: the big cost will be retrieving the data from ram (or disk!) in the first place.


You can't avoid looping with memory comparison of more than a few bytes. Write the algorithm as you can imagine it. It's simple enough and you might be amazed how well the compiler optimizes code like this.


memcmp simply does a "costly loop", byte for byte. For example, here is Microsoft's implementation:

EXTERN_C int __cdecl memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
    INT v = 0;
    BYTE *p1 = (BYTE *)Ptr1;
    BYTE *p2 = (BYTE *)Ptr2;

    while(Count-- > 0 && v == 0) {
        v = *(p1++) - *(p2++);
    }

    return v;
}

Most other implementations do the exact same thing. For your needs, you could do something like this:

long my_memcmp(const void *Ptr1, const void *Ptr2, size_t Count)
{
    INT v = 0;
    long pos = 0;
    BYTE *p1 = (BYTE *)Ptr1;
    BYTE *p2 = (BYTE *)Ptr2;

    while(Count-- > 0 && v == 0) 
    {
        v = *(p1++) - *(p2++);
        if (v == 0)
            pos++;
        else
            break;
    }

    return pos;
}


If there was a better way of comparing two blocks of memory, memcmp would be reimplemented to do that.

Having said that often, memcmp has a default portable implementation in the standard C library but there are is often implemented by the compiler itself as a builtin function. This builtin function should be highly optimized for the target architecture.So take the library implementation with a pinch of salt.


You will always need a loop. But you could benchmark if looping by 4 bytes (cast to int*) or by 8 bytes (uint64_t or long long int) is faster than the naive per-byte solution.

Even better, depending on the length (say, >1kb) you could unroll the loop, meaning you check e.g. per 8 int/uint64_t and on a mismatch pinpoint the first differing byte.

uint64_t *bigsteps1 = (uint64_t*)m1;
uint64_t *bigsteps2 = (uint64_t*)m2;
int steps = min(m1_len,m2_len)/sizeof(uint64_t);
int i;
for ( i=0; i<steps; i+=8 )
{
    if ( bigsteps1[i]   != bigsteps2[i]
      || bigsteps1[i+1] != bigsteps2[i+1]
   /// ....
      || bigsteps1[i+7] != bigsteps2[i+7] ) break;
}

// i<steps tells if we found a difference
// end game is trivial, left as an excercise to the reader.

The loop unroll may also backfire, for you have all these +N things in there and the i+=8 as well. Benchmark to be sure.

ps also check memory alignment: this will be fastest when m1&0xff == m2&0xff == 0

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜