Differences in disassembled C code of GCC and Borland?

Recently I got interested in disassembling C code (very simple C code) and followed a tutorial that used Borland C++ Compiler v5.5 (it compiles C code just fine), and everything worked. Then I decided to try my own C code and compiled it in Dev-C++ (which uses gcc). Upon opening it in IDA Pro I got a surprise: the asm from gcc was really different from Borland's. I expected some difference, but the C code was EXTREMELY simple, so is it just that gcc doesn't optimize as much, or do they use different default compiler settings?

The C Code

int main(int argc, char **argv)
{
   int a;
   a = 1;
}

Borland ASM

.text:00401150 ; int __cdecl main(int argc,const char **argv,const char *envp)
.text:00401150 _main           proc near               ; DATA XREF: .data:004090D0
.text:00401150
.text:00401150 argc            = dword ptr  8
.text:00401150 argv            = dword ptr  0Ch
.text:00401150 envp            = dword ptr  10h
.text:00401150
.text:00401150                 push    ebp
.text:00401151                 mov     ebp, esp
.text:00401153                 pop     ebp
.text:00401154                 retn
.text:00401154 _main           endp

GCC ASM (UPDATED BELOW)

.text:00401220 ; =============== S U B R O U T I N E =======================================
.text:00401220
.text:00401220 ; Attributes: bp-based frame
.text:00401220
.text:00401220                 public start
.text:00401220 start           proc near
.text:00401220
.text:00401220 var_14          = dword ptr -14h
.text:00401220 var_8           = dword ptr -8
.text:00401220
.text:00401220                 push    ebp
.text:00401221                 mov     ebp, esp
.text:00401223                 sub     esp, 8
.text:00401226                 mov     [esp+8+var_8], 1
.text:0040122D                 call    ds:__set_app_type
.text:00401233                 call    sub_401100
.text:00401238                 nop
.text:00401239                 lea     esi, [esi+0]
.text:00401240                 push    ebp
.text:00401241                 mov     ebp, esp
.text:00401243                 sub     esp, 8
.text:00401246                 mov     [esp+14h+var_14], 2
.text:0040124D                 call    ds:__set_app_type
.text:00401253                 call    sub_401100
.text:00401258                 nop
.text:00401259                 lea     esi, [esi+0]
.text:00401259 start           endp

GCC Update: Following JimR's suggestion, I went to see what sub_401100 is, and then followed that code to another function. This seems to be the code (am I correct in that assumption, and if so, why does GCC have all of its code in the main function?):

.text:00401100 sub_401100      proc near               ; CODE XREF: .text:004010F1j
.text:00401100                                         ; start+13p ...
.text:00401100
.text:00401100 var_28          = dword ptr -28h
.text:00401100 var_24          = dword ptr -24h
.text:00401100 var_20          = dword ptr -20h
.text:00401100 var_1C          = dword ptr -1Ch
.text:00401100 var_18          = dword ptr -18h
.text:00401100 var_C           = dword ptr -0Ch
.text:00401100 var_8           = dword ptr -8
.text:00401100
.text:00401100                 push    ebp
.text:00401101                 mov     ebp, esp
.text:00401103                 push    ebx
.text:00401104                 sub     esp, 24h        ; lpTopLevelExceptionFilter
.text:00401107                 lea     ebx, [ebp+var_8]
.text:0040110A                 mov     [esp+28h+var_28], offset sub_401000
.text:00401111                 call    SetUnhandledExceptionFilter
.text:00401116                 sub     esp, 4          ; uExitCode
.text:00401119                 call    sub_4012E0
.text:0040111E                 mov     [ebp+var_8], 0
.text:00401125                 mov     eax, offset dword_404000
.text:0040112A                 lea     edx, [ebp+var_C]
.text:0040112D                 mov     [esp+28h+var_18], ebx
.text:00401131                 mov     ecx, dword_402000
.text:00401137                 mov     [esp+28h+var_24], eax
.text:0040113B                 mov     [esp+28h+var_20], edx
.text:0040113F                 mov     [esp+28h+var_1C], ecx
.text:00401143                 mov     [esp+28h+var_28], offset dword_404004
.text:0040114A                 call    __getmainargs
.text:0040114F                 mov     eax, ds:dword_404010
.text:00401154                 test    eax, eax
.text:00401156                 jz      short loc_4011B0
.text:00401158                 mov     dword_402010, eax
.text:0040115D                 mov     edx, ds:_iob
.text:00401163                 test    edx, edx
.text:00401165                 jnz     loc_4011F6

.text:004012E0 sub_4012E0      proc near               ; CODE XREF: sub_401000+C6p
.text:004012E0                                         ; sub_401100+19p
.text:004012E0                 push    ebp
.text:004012E1                 mov     ebp, esp
.text:004012E3                 fninit
.text:004012E5                 pop     ebp
.text:004012E6                 retn
.text:004012E6 sub_4012E0      endp


Compiler output is expected to be different, sometimes dramatically different, for the same source, in the same way that a Toyota and a Honda are different. Four wheels and some seats, sure, but more different than the same when you look at the details.

Likewise, the same compiler with different compiler options can, and often will, produce dramatically different output for the same source code, even for what appear to be simple programs.

In the case of your simple program, which does not actually do anything (the code does not affect the input, the output, or anything outside the function), a good optimizing compiler will produce nothing but main: with a return. You didn't specify the return value, so under C89 the value returned is not defined (C99 and later treat reaching the end of main as returning 0), and a compiler may warn about it. This is the biggest problem I have when comparing compiler output: making something simple enough to see what the compilers are doing, but complicated enough that the compiler does more than just pre-compute the answer and return it.
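One way around this problem, as a sketch only: put the work in a function whose result depends on its argument, so the compiler cannot fold it into a constant when the function is compiled on its own. The name sum_to and the file names are hypothetical, and the gcc command lines in the comment assume an x86 target.

```c
/* sum_to.c: hypothetical example, not from the original question.
 *
 * Typical gcc invocations for comparing the generated assembly
 * (Intel syntax, x86 targets):
 *   gcc -O0 -S -masm=intel sum_to.c -o sum_to_O0.s
 *   gcc -O2 -S -masm=intel sum_to.c -o sum_to_O2.s
 */

/* Returns 1 + 2 + ... + n, or 0 when n < 1.
 * The result depends on the argument, so a compiler that sees only
 * this file cannot replace the function with a single constant. */
int sum_to(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Because sum_to has external linkage, the compiler normally has to emit a callable copy of it even when nothing in the file calls it, which makes it easy to find in a disassembler.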

In the case of x86, which I assume is what you are talking about here, the processors are microcoded these days, so there is really no single answer for good code vs bad code. Each processor family changes the internals, so what used to be fast may now be slow, and what is fast now may be slow on an older processor. For compilers like gcc that have continued to evolve with the new cores, the optimization can be either generic to all x86 processors or specific to a particular family (resulting in different code despite maximum optimization).

With your new interest in disassembling, you will continue to see the similarities and differences and find out just how many different ways the same code can be compiled. The differences are expected, even for trivial programs. I encourage you to try as many compilers as you can. Even within the gcc family, 2.x, 3.x, 4.x, and the different ways to build it will produce different code from what might be thought of as the same compiler.

Good vs bad output is in the eye of the beholder. Folks who use debuggers will want their code steppable and their variables watchable (in written code order). This makes for very big, bulky, and slow code (particularly for x86). Then when you compile for release you end up with a completely different program, one you have so far spent zero time debugging. Also, when optimizing for performance you take the risk of the compiler optimizing out something you wanted it to do (in your example above, no variable will be allocated and there will be no code to step through, even with minor optimization). Or worse, you expose bugs in the compiler and your program simply doesn't work (this is why some people discourage -O3 for gcc). That, and/or you discover the large number of places where the C standard leaves behavior implementation-defined.

Unoptimized code is easier to compile, as it is a bit more obvious. In the case of your example, the expectation is that a variable is allocated on the stack, some sort of stack pointer arrangement is set up, the immediate 1 is eventually written to that location, the stack is cleaned up, and the function returns. That is harder for compilers to get wrong, and it is more likely that your program works as you intended. Detecting and removing dead code is the business of optimization, and that is where it gets risky. Often the risk is worth the reward, but that depends on the user; beauty is in the eye of the beholder.
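If you want the store of 1 to survive optimization so you can find it in the disassembly, one option (a sketch, not something the question's code does) is to declare the variable volatile. The function name store_one is hypothetical.

```c
/* Hypothetical variant of the question's code.
 * Accesses to a volatile object count as observable behavior, so the
 * compiler has to keep the write of 1 and the later read even when
 * optimization is enabled. */
int store_one(void)
{
    volatile int a;

    a = 1;      /* this store must be emitted */
    return a;   /* this load must be emitted too */
}
```

Comparing this function against the same code without volatile at a given optimization level shows which instructions the optimizer was removing as dead code.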

Bottom line, short answer: differences are expected (even dramatic differences). Default compile options vary from compiler to compiler. Experiment with the compile/optimization options and with different compilers, and continue to disassemble your programs to get a better education about the language and the compilers you use. You are on the right track so far.

In the case of the Borland output, it detected that your program does nothing: no input variables are used, no return value is used or related to the local variables, and no global variables or other resources external to the function are used. The integer a and the assignment of an immediate are dead code, and a good optimizer will essentially remove/ignore both lines of code. So it bothered to set up a stack frame and then clean it up, which it didn't need to do, and then returned. gcc looks to be setting up an exception handler, which is perfectly fine even though it doesn't need to. Start optimizing, or use a function name other than main(), and you should see different results.


What is most likely happening here is that Borland calls main from its startup code, after initializing everything with code present in its runtime library.

The gcc code does not look like main to me, but like generated code that calls main. Disassemble the code at sub_401100 and see if it looks like your main proc.


First of all, make sure you have enabled an optimization flag such as -O2 for gcc; by default (-O0) you get essentially no optimization at all.

With this little example you aren't really testing optimization; you're seeing how program initialization works. For example, gcc calls __set_app_type to tell the C runtime whether this is a console or GUI application, and sub_401100 does further runtime setup (in your listing it installs an unhandled-exception filter and fetches the arguments for main via __getmainargs). Borland might call the runtime initialization beforehand, while gcc does it within main().


Here's the disassembly of main() that I get from MinGW's gcc 4.5.1 in gdb (I added a return 0 at the end so GCC wouldn't complain):

First, when the program is compiled with -O3 optimization:

(gdb) set disassembly-flavor intel
(gdb) disassemble
Dump of assembler code for function main:
   0x00401350 <+0>:     push   ebp
   0x00401351 <+1>:     mov    ebp,esp
   0x00401353 <+3>:     and    esp,0xfffffff0
   0x00401356 <+6>:     call   0x4018aa <__main>
=> 0x0040135b <+11>:    xor    eax,eax
   0x0040135d <+13>:    mov    esp,ebp
   0x0040135f <+15>:    pop    ebp
   0x00401360 <+16>:    ret
End of assembler dump.

And with no optimizations:

(gdb) set disassembly-flavor intel
(gdb) disassemble
Dump of assembler code for function main:
   0x00401350 <+0>:     push   ebp
   0x00401351 <+1>:     mov    ebp,esp
   0x00401353 <+3>:     and    esp,0xfffffff0
   0x00401356 <+6>:     sub    esp,0x10
   0x00401359 <+9>:     call   0x4018aa <__main>
=> 0x0040135e <+14>:    mov    DWORD PTR [esp+0xc],0x1
   0x00401366 <+22>:    mov    eax,0x0
   0x0040136b <+27>:    leave
   0x0040136c <+28>:    ret
End of assembler dump.

These are a little more complex than Borland's example, but not excessively.

Note that the calls to 0x4018aa are calls to a library/compiler-supplied function that constructs C++ objects. Here's a snippet from some GCC toolchain docs:

The actual calls to the constructors are carried out by a subroutine called __main, which is called (automatically) at the beginning of the body of main (provided main was compiled with GNU CC). Calling __main is necessary, even when compiling C code, to allow linking C and C++ object code together. (If you use '-nostdlib', you get an unresolved reference to __main, since it's defined in the standard GCC library. Include '-lgcc' at the end of your compiler command line to resolve this reference.)
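To see the kind of work this startup machinery does, here is a minimal sketch using GCC's constructor function attribute, which asks for a function to be run before main is entered. The names runtime_ready and mark_ready are hypothetical, and how the call is dispatched (through __main or some other mechanism) depends on the target.

```c
/* Hypothetical illustration; the attribute is a GCC extension,
 * not standard C. */
int runtime_ready = 0;

/* Functions marked with the constructor attribute are run by the
 * startup code before main() begins executing. */
__attribute__((constructor))
static void mark_ready(void)
{
    runtime_ready = 1;
}
```

By the time main starts, runtime_ready is already 1 even though main never calls mark_ready, which is the same reason a disassembler shows code running before your own main.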

I'm not sure what exactly IDA Pro is showing in your examples. IDA Pro labels what it's showing as start not main so I'd guess that JimR's answer is right - it's probably the runtime's initialization (perhaps the entry point as described in the .exe header - which is not main(), but the runtime initialization entry point).

Does IDA Pro understand gcc's debug symbols? Did you compile with the -g option so the debug symbols are generated?


It looks like the Borland compiler is recognizing that you never actually do anything with a and is just giving you the equivalent assembly for an empty main function.


The difference here is mostly not in the compiled code, but in what the disassembler shows you. You may think that main is the only function in your program, but it is not. In fact, your program is something like this:

void start()
{
    /* ... some initialization code here ... */
    int result = main();
    /* ... some deinitialization code here ... */
    ExitProcess(result);
}

IDA Pro knows how Borland works, so it can navigate directly to your main, but it doesn't know how gcc works, so it shows you the true entry point of your program. You can see in the Borland ASM that main is called from some other function. In the GCC ASM you can go through all of these sub_40xxx routines to find your main.
