Dive into the assembly
Function in C:
PHPAPI char *php_pcre_replace(char *regex, int regex_len,
                              char *subject, int subject_len,
                              zval *replace_val, int is_callable_replace,
                              int *result_len, int limit, int *replace_count TSRMLS_DC)
{
    pcre_cache_entry *pce; /* Compiled regular expression */

    /* Compile regex or get it from cache. */
    if ((pce = pcre_get_compiled_regex_cache(regex, regex_len TSRMLS_CC)) == NULL) {
        return NULL;
    }
    ....
}
Its assembly:
php5ts!php_pcre_replace:
1015db70 8b442408 mov eax,dword ptr [esp+8]
1015db74 8b4c2404 mov ecx,dword ptr [esp+4]
1015db78 56 push esi
1015db79 8b74242c mov esi,dword ptr [esp+2Ch]
1015db7d 56 push esi
1015db7e 50 push eax
1015db7f 51 push ecx
1015db80 e8cbeaffff call php5ts!pcre_get_compiled_regex_cache (1015c650)
1015db85 83c40c add esp,0Ch
1015db88 85c0 test eax,eax
1015db8a 7502 jne php5ts!php_pcre_replace+0x1e (1015db8e)
php5ts!php_pcre_replace+0x1c:
1015db8c 5e pop esi
1015db8d c3 ret
The C function call pcre_get_compiled_regex_cache(regex, regex_len TSRMLS_CC) corresponds to addresses 1015db7d through 1015db80, which push the three arguments onto the stack and then call the function.
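To make the stack offsets in the listing easier to follow, here is a minimal C sketch that models the stack as an array and replays the prologue. It assumes the usual 32-bit cdecl layout (arguments pushed right to left, return address on top), and it assumes that TSRMLS_DC contributes a tenth argument, which is what the [esp+2Ch] load appears to pick up. The helper names and the values 100 + n are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

static uint32_t stack_mem[STACK_BYTES / 4]; /* toy stack made of 4-byte slots */
static uint32_t esp = STACK_BYTES;          /* byte offset of the stack top */

static void push(uint32_t value)
{
    assert(esp >= 4);             /* toy stack must not overflow */
    esp -= 4;                     /* the stack grows toward lower addresses */
    stack_mem[esp / 4] = value;
}

static uint32_t load(uint32_t offset)       /* models dword ptr [esp+offset] */
{
    return stack_mem[(esp + offset) / 4];
}

/* Replays the listing; returns 1 if the inner call receives args 1, 2 and 10. */
int simulate_prologue(void)
{
    uint32_t i, eax, ecx, esi;

    esp = STACK_BYTES;
    /* Caller side (assumed cdecl): push the ten arguments right to left.
       Argument n gets the made-up value 100 + n so it is easy to spot. */
    for (i = 10; i >= 1; i--)
        push(100 + i);
    push(0xAAAAAAAAu);            /* return address pushed by the outer call */

    eax = load(0x08);             /* mov eax,[esp+8]   -> argument 2 */
    ecx = load(0x04);             /* mov ecx,[esp+4]   -> argument 1 */
    push(0xBBBBBBBBu);            /* push esi (placeholder for caller's esi) */
    esi = load(0x2C);             /* mov esi,[esp+2Ch] -> argument 10 */
    push(esi);                    /* third argument of the inner call */
    push(eax);                    /* second argument */
    push(ecx);                    /* first argument */
    push(0xCCCCCCCCu);            /* return address pushed by the inner call */

    /* Callee's view: its arguments sit at [esp+4], [esp+8], [esp+0Ch]. */
    return load(0x04) == 101 && load(0x08) == 102 && load(0x0C) == 110;
}
```

Note that the push esi shifts every later offset by 4, which is why the tenth argument is read at 2Ch rather than 28h.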
But my question is: with so many registers available, how does the compiler decide to use eax, ecx, and esi as the intermediates that carry values to the stack? esi is a special case, since it is saved before being used and restored before returning; why?
There must be some hidden indication in the C source that tells the compiler to do it this way, right?
No, there is no hidden indication.
This is a typical strategy for generating 80x86 instructions, used by many compiler implementations for C and other languages. For example, the 1980s Intel Fortran-77 compiler did the same thing when optimization was turned on.
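That it uses eax and ecx preferentially is probably an artifact of avoiding esi and edi, since those registers cannot be used directly to load byte operands. Under the common 32-bit calling conventions, eax, ecx, and edx are also scratch registers that a function may overwrite freely, whereas esi, edi, ebx, and ebp must be preserved for the caller. That is why esi is pushed on entry and popped before ret: the value loaded into it presumably has to survive later calls inside the function.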
Why not ebx and edx? Well, many code generators prefer those for holding intermediate pointers when evaluating complex structure expressions, which is to say there isn't much of a reason at all. The compiler just looked for two available registers and overwrote them to buffer the values.
Why not reuse eax, like this?
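push esi
mov eax,dword ptr [esp+2Ch]   ; third argument of the inner call
push eax
mov eax,dword ptr [esp+10h]   ; regex_len (offset shifted by the two pushes)
push eax
mov eax,dword ptr [esp+10h]   ; regex (offset shifted by the three pushes)
push eax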
Because each push would then have to wait for the preceding load into eax to complete its memory cycle, which causes pipeline stalls on 80x86 processors since the 80586 (maybe the 80486; it's too long ago to be sure off the top of my head).
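As a rough illustration of that point, the following C sketch counts how often an instruction reads a register that the instruction immediately before it has just written. This is a toy measure, not a cycle-accurate model: it ignores the implicit use of esp, and processors with register renaming behave differently. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Registers named in the listing; NONE means "no register operand". */
enum reg { NONE, EAX, ECX, ESI };

struct insn {
    const char *text;   /* disassembly, for reference only */
    enum reg writes;    /* register the instruction overwrites */
    enum reg reads;     /* register the instruction needs as input */
};

/* Count adjacent pairs in which the second instruction reads the register
   that the first one has just written (esp is deliberately ignored). */
static int count_adjacent_raw(const struct insn *code, size_t n)
{
    int hazards = 0;
    size_t i;

    for (i = 1; i < n; i++) {
        if (code[i - 1].writes != NONE && code[i - 1].writes == code[i].reads)
            hazards++;
    }
    return hazards;
}

/* The order the compiler actually emitted. */
int compiler_order_hazards(void)
{
    static const struct insn code[] = {
        { "mov eax,[esp+8]",   EAX,  NONE },
        { "mov ecx,[esp+4]",   ECX,  NONE },
        { "push esi",          NONE, ESI  },
        { "mov esi,[esp+2Ch]", ESI,  NONE },
        { "push esi",          NONE, ESI  },
        { "push eax",          NONE, EAX  },
        { "push ecx",          NONE, ECX  },
    };
    return count_adjacent_raw(code, sizeof code / sizeof code[0]);
}

/* The hypothetical order that funnels everything through eax. */
int reuse_eax_hazards(void)
{
    static const struct insn code[] = {
        { "push esi",          NONE, ESI  },
        { "mov eax,[esp+2Ch]", EAX,  NONE },
        { "push eax",          NONE, EAX  },
        { "mov eax,[...]",     EAX,  NONE },
        { "push eax",          NONE, EAX  },
        { "mov eax,[...]",     EAX,  NONE },
        { "push eax",          NONE, EAX  },
    };
    return count_adjacent_raw(code, sizeof code / sizeof code[0]);
}
```

Under this simplified measure the compiler's ordering has one load immediately followed by a dependent push, while the eax-only ordering has three.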
The x86 architecture is a strange beast. Each register, though promoted by Intel as "general purpose", has its quirks: cx/ecx is tied to the loop instruction, for example, and edx:eax is tied to the multiply instruction. That, combined with the peculiar ways of optimizing execution to avoid cache misses and pipeline stalls, often leads to inscrutable output from a code generator that factors all of it in.