Disassembly of C#: why is DUMPBIN's native code so different from the debugger's Disassembly window?
Suppose this is my program simpleCsharp.exe:
namespace simpleCsharp
{
    public class Program
    {
        public static int Main(string[] args)
        {
            uint x = 0xFEFEFE;
            uint y = 0xEEEEEE;
            uint z;
            uint[] list = { 0, 1, 2, 4, 8 };
            uint[] array = { 0xA, 0xB, 0xC, 0xD };
            z = x + y + list[2] + array[1];
            z = z - (y << 1);
            return 0;
        }
    }
}
If I view the disassembly of a simple C# program in the debugger's Disassembly window, the native code output at least makes some sense. For example, here is the debugger's disassembly of Main, with Optimization on:
uint x = 0xFEFEFE;
00000000 push ebp
00000001 mov ebp,esp
00000003 sub esp,28h
00000006 xor eax,eax
00000008 mov dword ptr [ebp-14h],eax
0000000b mov dword ptr [ebp-18h],eax
0000000e mov dword ptr [ebp-4],ecx
00000011 cmp dword ptr ds:[037D14ACh],0
00000018 je 0000001F
0000001a call 763B370F
0000001f xor edx,edx
00000021 mov dword ptr [ebp-0Ch],edx
00000024 xor edx,edx
00000026 mov dword ptr [ebp-1Ch],edx
00000029 xor edx,edx
0000002b mov dword ptr [ebp-20h],edx
0000002e xor edx,edx
00000030 mov dword ptr [ebp-8],edx
00000033 xor edx,edx
00000035 mov dword ptr [ebp-10h],edx
00000038 mov dword ptr [ebp-8],0FEFEFEh
uint y = 0xEEEEEE;
0000003f mov dword ptr [ebp-0Ch],0EEEEEEh
uint z;
uint[] list = { 0, 1, 2, 4, 8 };
00000046 mov edx,5
0000004b mov ecx,79882916h
00000050 call FD95FD70
00000055 mov dword ptr [ebp-24h],eax
00000058 lea ecx,[ebp-14h]
0000005b mov edx,37D25E0h
00000060 call 761A4716
00000065 lea eax,[ebp-14h]
00000068 push dword ptr [eax]
0000006a mov ecx,dword ptr [ebp-24h]
0000006d call 761A47F3
00000072 mov eax,dword ptr [ebp-24h]
00000075 mov dword ptr [ebp-1Ch],eax
uint[] array = { 0xA, 0xB, 0xC, 0xD };
00000078 mov edx,4
0000007d mov ecx,79882916h
00000082 call FD95FD70
00000087 mov dword ptr [ebp-28h],eax
0000008a lea ecx,[ebp-18h]
0000008d mov edx,37D25ECh
00000092 call 761A4716
00000097 lea eax,[ebp-18h]
0000009a push dword ptr [eax]
0000009c mov ecx,dword ptr [ebp-28h]
0000009f call 761A47F3
000000a4 mov eax,dword ptr [ebp-28h]
000000a7 mov dword ptr [ebp-20h],eax
z = x + y + list[2] + array[1];
000000aa mov eax,dword ptr [ebp-8]
000000ad add eax,dword ptr [ebp-0Ch]
000000b0 mov edx,dword ptr [ebp-1Ch]
000000b3 cmp dword ptr [edx+4],2
000000b7 ja 000000BE
000000b9 call 763B6900
000000be add eax,dword ptr [edx+10h]
000000c1 mov edx,dword ptr [ebp-20h]
000000c4 cmp dword ptr [edx+4],1
000000c8 ja 000000CF
000000ca call 763B6900
000000cf add eax,dword ptr [edx+0Ch]
000000d2 mov dword ptr [ebp-10h],eax
z = z - (y << 1);
000000d5 mov eax,dword ptr [ebp-0Ch]
000000d8 add eax,eax
000000da sub dword ptr [ebp-10h],eax
return 0;
000000dd xor eax,eax
000000df mov esp,ebp
000000e1 pop ebp
000000e2 ret
However, if I run DUMPBIN on the same C# assembly (with Debug Info = "None" so it doesn't just show bytes), i.e.
dumpbin "simpleCsharp.exe" /disasm /out:"simpleCsharp_dump.txt"
the native code in the generated file doesn't even remotely resemble what I saw in the debugger's Disassembly window. I don't see a single instruction or value from the debugger's disassembly in the dumpbin file. So the two lines of native code above that contain FEFEFE and EEEEEE are nowhere to be found. This is the case whether I run dumpbin on the assembly built by Visual Studio (2010) or use ngen.exe to generate a native image and run dumpbin on the native image file, simpleCsharp.ni.exe.
Optimization is on in the debugged build, and the build configuration is set to Release. The only difference between the assembly I debug and the assembly I give to ngen is Debug Info = "None".
dumpbin simpleCsharp.ni.exe /disasm
Here is the disassembly of the simpleCsharp program when I run dumpbin on the native image file:
https://docs.google.com/leaf?id=0B9u9yFU99BOcYjNmNGRmNTItZjQ0NC00YmI0LWEyZTQtNjdkNDdhYTc2MmNm&hl=en
I would at least expect to see the number FEFEFE or EEEEEE show up in the output of dumpbin somewhere, and it does show up in Debug Disassembly.
Could someone please explain why I don't see one line of Debug's disassembly code in the dumpbin output from the native image file, for the same program? If it's because of optimization, would you mind giving a little detail?
Thanks
You are forgetting about the just-in-time compiler. A managed assembly doesn't contain machine code; the machine code is generated at runtime by the jitter from the IL in the assembly. You can look at that IL with tools like ildasm.exe or Reflector. Dumpbin.exe has poor support for managed assemblies: it can dump the CLR header, and that's about it.
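To make the point about IL concrete, here is a minimal sketch that uses reflection to print the raw IL bytes of a method. The class name IlDump and the method Sample are hypothetical stand-ins for the question's program, not part of it. Assuming the compiler emits the constant with the ldc.i4 opcode (0x20 followed by a little-endian 32-bit operand), the value 0xFEFEFE would typically appear in the dump as 20-FE-FE-FE-00, but the exact bytes depend on the compiler and its settings.

```csharp
using System;
using System.Reflection;

public static class IlDump
{
    // Hypothetical stand-in for the question's Main; it uses the same constants.
    public static uint Sample(uint[] list)
    {
        uint x = 0xFEFEFE;
        uint y = 0xEEEEEE;
        return x + y + list[2];
    }

    // Returns the IL of a public method on this class as a dash-separated hex string.
    public static string DumpIl(string methodName)
    {
        MethodInfo method = typeof(IlDump).GetMethod(methodName);
        byte[] il = method.GetMethodBody().GetILAsByteArray();
        return BitConverter.ToString(il);
    }

    public static void Main()
    {
        // This prints IL, not x86 machine code: the assembly on disk only holds IL.
        Console.WriteLine(DumpIl("Sample"));
    }
}
```

This is the same information ildasm.exe shows in readable form; the x86 instructions only come into existence once the jitter (or ngen) compiles this IL.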
Beware that the ngen-ed image contains machine code that was optimized by the jitter, and that optimizer alters the machine code a great deal. Optimization is off by default when you run under the debugger. To see the optimized code there, you have to debug the Release build and change a debugger option: in Tools + Options, Debugging, General, untick the "Suppress JIT optimization on module load" option. Also beware that the generated code can be just plain different in places because it was precompiled instead of jitted. The jitter can do a better job because it has knowledge that isn't available up front.
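As a rough way to check which situation a running program is in, the sketch below reports whether a debugger is attached and whether the assembly itself was built with the JIT optimizer disabled. The class name JitInfo is hypothetical. Note the limitation: this reads only the assembly-level DebuggableAttribute written by the compiler; it cannot see the Visual Studio "Suppress JIT optimization on module load" setting, which is applied by the debugger.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class JitInfo
{
    // Describes the build-time JIT settings recorded in the given assembly.
    public static string Describe(Assembly asm)
    {
        var attr = (DebuggableAttribute)Attribute.GetCustomAttribute(
            asm, typeof(DebuggableAttribute));

        // No attribute at all means the build did not ask to disable the optimizer.
        bool optimizerDisabled = attr != null && attr.IsJITOptimizerDisabled;

        return "debugger attached: " + Debugger.IsAttached +
               ", JIT optimizer disabled by build: " + optimizerDisabled;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(Assembly.GetExecutingAssembly()));
    }
}
```

If this reports that the optimizer is not disabled by the build but a debugger is attached, the code you see in the Disassembly window may still be unoptimized unless the debugger option mentioned above has been unticked.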