How does the C# compiler optimize a code fragment?
If I have code like this:
for (int i = 0; i < 10; i++)
{
    int iTemp;
    iTemp = i;
    //.........
}
Does the compiler instantiate iTemp 10 times?
Or does it optimize it?
I mean, if I rewrite the loop as
int iTemp;
for (int i = 0; i < 10; i++)
{
    iTemp = i;
    //.........
}
Will it be faster?
Using Reflector, you can view the IL generated by the C# compiler.
.method private hidebysig static void Way1() cil managed
{
.maxstack 2
.locals init (
[0] int32 i)
L_0000: ldc.i4.0
L_0001: stloc.0
L_0002: br.s L_0008
L_0004: ldloc.0
L_0005: ldc.i4.1
L_0006: add
L_0007: stloc.0
L_0008: ldloc.0
L_0009: ldc.i4.s 10
L_000b: blt.s L_0004
L_000d: ret
}
.method private hidebysig static void Way2() cil managed
{
.maxstack 2
.locals init (
[0] int32 i)
L_0000: ldc.i4.0
L_0001: stloc.0
L_0002: br.s L_0008
L_0004: ldloc.0
L_0005: ldc.i4.1
L_0006: add
L_0007: stloc.0
L_0008: ldloc.0
L_0009: ldc.i4.s 10
L_000b: blt.s L_0004
L_000d: ret
}
They're exactly the same (note that the compiler has removed the unused iTemp local entirely; only i appears in the locals), so it makes no performance difference where you declare iTemp.
As others have said, the code you've shown produces equivalent IL, except when the variable is captured by a lambda expression for later execution. In that case the code is different as it must keep track of the current value of the variable for the expression. There may be other instances where the optimization doesn't take place as well.
Creating a fresh copy of the loop variable is a common technique when you want to capture the value for a lambda expression.
Try:
var a = new List<int> { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var q = a.AsEnumerable();
int iTemp;
for (int i = 0; i < 10; i++)
{
    iTemp = i;
    q = q.Where( x => x <= iTemp );
}
Console.WriteLine( string.Format( "{0}, count is {1}",
    string.Join( ":", q.Select( x => x.ToString() ).ToArray() ),
    q.Count() ) );
and
var a = new List<int> { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var q = a.AsEnumerable();
for (int i = 0; i < 10; i++)
{
    var iTemp = i;
    q = q.Where( x => x <= iTemp );
}
Console.WriteLine( string.Format( "{0}, count is {1}",
    string.Join( ":", q.Select( x => x.ToString() ).ToArray() ),
    q.Count() ) );
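If I'm tracing the deferred execution correctly, these two fragments print different results: in the first, every lambda shares the single iTemp and sees its final value (9) when the query runs, so all ten elements pass the filters; in the second, each iteration captures its own copy (0 through 9), and the combined filters leave only the element 0.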
If you're really curious about how CSC (the C# compiler) treats your code, you might want to play with LINQPad; among other things, it lets you enter short C# expressions or programs and look at the resulting IL (CLR bytecode).
One thing to keep in mind is that local variables are typically allocated on the stack. One task that a compiler must do is figure out how much stack space a particular method requires and set that aside.
Consider:
int Func(int a, int b, int c)
{
    int x = a * 2;
    int y = b * 3;
    int z = c * 4;
    return x + y + z;
}
Ignoring the fact that this can be easily optimized to return (a * 2) + (b * 3) + (c * 4), the compiler is going to see three local variables and set aside room for them.
If I have this:
int Func(int a, int b, int c)
{
    int x = a * 2;
    {
        int y = b * 3;
        {
            int z = c * 4;
            {
                return x + y + z;
            }
        }
    }
}
It's still the same 3 local variables - just in different scopes. A for loop is nothing but a scope block with a little glue code to make it work.
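As a rough sketch (this isn't the compiler's literal output, just the language-level picture, reusing the loop from the question), a for loop expands to something like:
{
    int i = 0;             // initializer, in its own outer scope
    while (i < 10)         // condition
    {
        {
            int iTemp;     // the body's own scope
            iTemp = i;
            //.........
        }
        i++;               // iterator ("glue" that runs after each pass)
    }
}
The braces only control where names are visible; they don't change how much stack space the method as a whole sets aside.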
Now consider this:
int Func(int a, int b, int c)
{
    int x = a * 2;
    {
        int y = b * 3;
        x += y;
    }
    {
        int z = c * 4;
        x += z;
    }
    return x;
}
This is the only case where it could be different. You have variables y and z which go in and out of scope - once they are out of scope, the stack space is no longer needed. The compiler could choose to reuse those slots such that y and z share the same space. As optimizations go, it's simple but it doesn't gain much - it saves some space, which might be important on embedded systems, but not in most .NET applications.
As a side note, the C# compiler in VS2008, even in a release build, isn't performing even the simplest optimizations here, such as eliminating the redundant stores to and loads from the locals. The IL for the first version is this:
L_0000: ldarg.0      // a
L_0001: ldc.i4.2
L_0002: mul          // a * 2
L_0003: stloc.0      // x = a * 2
L_0004: ldarg.1      // b
L_0005: ldc.i4.3
L_0006: mul          // b * 3
L_0007: stloc.1      // y = b * 3
L_0008: ldarg.2      // c
L_0009: ldc.i4.4
L_000a: mul          // c * 4
L_000b: stloc.2      // z = c * 4
L_000c: ldloc.0
L_000d: ldloc.1
L_000e: add          // x + y
L_000f: ldloc.2
L_0010: add          // + z
L_0011: ret
whereas I fully expected to see this:
L_0000: ldarg.0
L_0001: ldc.i4.2
L_0002: mul
L_0003: ldarg.1
L_0004: ldc.i4.3
L_0005: mul
L_0006: add
L_0007: ldarg.2
L_0008: ldc.i4.4
L_0009: mul
L_000a: add
L_000b: ret
The compiler will do the optimisation you've shown for you.
It's a simple form of loop hoisting.
A lot of people have provided you IL to show you that your two code fragments are effectively the same from a performance perspective. It's not really necessary to go to that level of detail to see why this is the case. Just think about this from the perspective of the call stack.
Effectively, for a method containing a code fragment like the two you provided, the compiler will emit code at the beginning of the method to allocate space for all the locals used within that method.
In both cases what the compiler sees is a local named iTemp, so when it allocates space on the stack for the locals it will allocate 32 bits to hold iTemp. It doesn't matter to the compiler that iTemp has different scope in the two code fragments; the compiler enforces that simply by not allowing you to refer to iTemp outside the for loop in the first fragment. What it will do is allocate this space once (at the beginning of the method) and reuse it as needed during the loop in the first fragment.
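In IL terms (a sketch based on a debug build, since an optimized build may drop the otherwise-unused iTemp altogether), both fragments end up with a .locals init directive that reserves one int32 slot for i and one for iTemp for the whole method; the scoping difference exists only at the C# level.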
The C# compiler doesn't always need to do a good job. The JIT optimizer is tuned for the IL that the C# compiler emits; better-looking IL does not (necessarily) produce better-looking machine code.
Let's take an earlier example:
static int Func(int a, int b, int c)
{
    int x = a * 2;
    int y = b * 3;
    int z = c * 4;
    return x + y + z;
}
The emitted IL from the 3.5 compiler with optimizations enabled looks like this:
.method private hidebysig static int32 Func(int32 a,
int32 b,
int32 c) cil managed
{
// Code size 18 (0x12)
.maxstack 2
.locals init (int32 V_0,
int32 V_1,
int32 V_2)
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: mul
IL_0003: stloc.0
IL_0004: ldarg.1
IL_0005: ldc.i4.3
IL_0006: mul
IL_0007: stloc.1
IL_0008: ldarg.2
IL_0009: ldc.i4.4
IL_000a: mul
IL_000b: stloc.2
IL_000c: ldloc.0
IL_000d: ldloc.1
IL_000e: add
IL_000f: ldloc.2
IL_0010: add
IL_0011: ret
} // end of method test::Func
Not very optimal, right? I'm compiling it into an executable and calling it from a simple Main method, and the compiler isn't inlining it or really doing any optimizations.
So what is happening at runtime?
The JIT is in fact inlining the call to Func() and producing much better code than you might imagine when looking at the stack-based IL up above:
mov edx,dword ptr [rbx+10h]        ; load the value used as a
mov eax,1
cmp rax,rdi
jae 000007ff`00190265              ; range check
mov eax,dword ptr [rbx+rax*4+10h]  ; load the value used as b
mov ecx,2
cmp rcx,rdi
jae 000007ff`00190265              ; range check
mov ecx,dword ptr [rbx+rcx*4+10h]  ; load the value used as c
add edx,edx                        ; a * 2 (add instead of mul)
lea eax,[rax+rax*2]                ; b * 3 (lea instead of mul)
shl ecx,2                          ; c * 4 (shift instead of mul)
add eax,edx                        ; a * 2 + b * 3
lea esi,[rax+rcx]                  ; ... + c * 4