Just out of curiosity: how come the Linux kernel's "optimized" strcpy is much slower than the libc implementation?
I tried to benchmark the optimized string operations in http://lxr.linux.no/#linux+v2.6.38/arch/x86/lib/string_32.c and compare them to the regular strcpy:
#include<stdio.h>
#include<stdlib.h>
char *_strcpy(char *dest, const char *src)
{
int d0, d1, d2;
asm volatile("1:\tlodsb\n\t"
"stosb\n\t"
"testb %%al,%%al\n\t"
"jne 1b"
: "=&S" (d0), "=&D" (d1), "=&a" (d2)
: "0" (src), "1" (dest) : "memory");
return dest;
}
int main(int argc, char **argv){
int times = 1;
if(argc >1)
{
times = atoi(argv[1]);
}
char a[100];
for(; times; times--)
_strcpy(a, "Hello _strcpy!");
return 0;
}
Timing it using (time .. ) showed that it is about 10 times slower than the regular strcpy (on x64 Linux).
Why?
If your string is constant, it's possible that the compiler is inlining the copy (for the plain strcpy
call), turning it into a series of unconditional MOV instructions.
Since this is straight-line code with no conditional branches, it would be faster than the Linux variant.
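
To make the point above concrete, here is a rough sketch in plain C of the three situations involved. The function names (copy_bytewise, copy_known_length, copy_opaque) are hypothetical and exist only for illustration. Whether a given compiler really expands the constant-string case into straight-line moves depends on the compiler and its flags, so treat the comments as assumptions to verify by inspecting the generated assembly.

```c
#include <string.h>

/* Byte-at-a-time loop with a test and branch per byte: roughly the
 * work the lodsb/stosb version from the question does. */
char *copy_bytewise(char *dest, const char *src)
{
    char *d = dest;
    while ((*d++ = *src++) != '\0')
        ;
    return dest;
}

/* Assumption: this is roughly what strcpy(a, "Hello _strcpy!") can
 * reduce to when the source is a literal. The length is known at
 * compile time, so the copy becomes a fixed-size block move. */
char *copy_known_length(char *dest)
{
    static const char msg[] = "Hello _strcpy!";
    memcpy(dest, msg, sizeof msg); /* sizeof counts the trailing '\0' */
    return dest;
}

/* Reads the source pointer through a volatile object, so the compiler
 * cannot assume it still points at a literal of known length and has
 * to perform a general strcpy. */
char *copy_opaque(char *dest, const char *src)
{
    const char *volatile hidden = src;
    return strcpy(dest, hidden);
}
```

For a fairer benchmark, one option is to call something like copy_opaque in the loop instead of plain strcpy, or to build with GCC's -fno-builtin-strcpy, so that both variants copy a string whose length is not known at compile time.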