Curious Performance difference between returning a value or returning through an Action<T> parameter

I was curious to see what the performance difference is between returning a value from a method and returning it through an Action parameter.

There is a somewhat related question: Performance of calling delegates vs methods

But for the life of me I can't explain why returning a value would be ~30% slower than calling a delegate to return the value. Is the .NET JIT compiler (not the C# compiler) inlining my simple delegate? I didn't think it did that.

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();

        A aa = new A();

        long l = 0;
        for( int i = 0; i < 100000000; i++ )
        {
            aa.DoSomething( i - 1, i, r => l += r );
        }

        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );

        sw.Reset();
        sw.Start();

        l = 0;
        for( int i = 0; i < 100000000; i++ )
        {
            l += aa.DoSomething2( i - 1, i );
        }

        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );
    }
}
class A
{
    private B bb = new B();

    public void DoSomething( int a, int b, Action<long> result )
    {
        bb.Add( a,b, result );
    }

    public long DoSomething2( int a, int b  )
    {
        return bb.Add2( a,b );
    }

}
class B
{
    public void Add( int a, int b, Action<long> result )
    {
        result( a + b );
    }

    public long Add2( int i, int i1 )
    {
        return i + i1;
    }
}


I made a couple of changes to your code.

  • Moved new A() before the timed section.
  • Added warmup code before the timed section to get the methods JIT'ed.
  • Created an Action<long> reference before the timed section and loop so that it does not have to be created on each iteration. This one seemed to have a big impact on execution time.
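To illustrate why the last change can matter: a lambda that captures a local such as `l` is not cached by the C# compiler, so a new delegate object is allocated each time the lambda expression is evaluated inside the loop. The sketch below writes that out by hand. The `Closure` class and the `Demo` methods are hypothetical names made up for illustration, and the real compiler-generated code differs in detail.

```csharp
using System;

// Hand-written stand-in for the compiler-generated closure class
// (hypothetical name; the compiler picks its own).
sealed class Closure
{
    public long l;
    public void Accumulate(long r) { l += r; }
}

class B
{
    public void Add(int a, int b, Action<long> result) { result(a + b); }
}

static class Demo
{
    // Roughly what "r => l += r" inside the loop amounts to:
    // one closure object, but a fresh delegate on every iteration.
    public static long PerIteration(int n)
    {
        Closure c = new Closure();
        B bb = new B();
        for (int i = 0; i < n; i++)
        {
            bb.Add(i - 1, i, new Action<long>(c.Accumulate));
        }
        return c.l;
    }

    // The delegate is created once and reused, as in the third change above.
    public static long Hoisted(int n)
    {
        Closure c = new Closure();
        B bb = new B();
        Action<long> accumulate = c.Accumulate;
        for (int i = 0; i < n; i++)
        {
            bb.Add(i - 1, i, accumulate);
        }
        return c.l;
    }

    static void Main()
    {
        // Both variants compute the same sum; only the allocation pattern differs.
        Console.WriteLine(PerIteration(1000) == Hoisted(1000));
    }
}
```

Both methods do the same arithmetic, so any timing difference between them comes from the per-iteration delegate allocation and the garbage it produces.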

Here are my results after making the above changes. The vshost column indicates whether the code was executing inside the vshost.exe process (by running directly from Visual Studio). I was using Visual Studio 2008 and targeted .NET 3.5 SP1.

vshost?   Debug   Release
-------------------------
 YES       6405     3827
          11059     3092

 NO        4214     1691
           4607      811

Notice how you get different results depending on the build configuration and the execution environment. The results are interesting if nothing else. If I get time I might edit my answer to provide a theory.


Strangely, I'm not seeing the behavior you're describing when running a Release build in VS. I am seeing it when running a Debug build. The only thing I can figure is that there's added overhead with the return-based approach when running the Debug build, though I'm not clever enough to see why.

Here's something else that's interesting: this discrepancy disappears when I switch to an x64 build (Release or Debug).

If I were to venture a guess (completely unsubstantiated), it might be that the cost of passing the 64-bit long as a return value in both B.Add2 and A.DoSomething2 outweighs that of passing the Action<long> in a 32-bit environment. In a 64-bit environment, this savings would vanish as the Action<long> would require 64 bits as well. In a Release build in either configuration, the cost of passing the long probably disappears as both B.Add2 and A.DoSomething2 seem like prime candidates for inlining.

Somebody who knows way more about this than I do: feel free to totally refute everything I just said. We're all here to learn, after all ;)


Well, for starters, your call to new A() is being timed the way you currently have your code set up. You also need to make sure you're running in release mode with optimizations on. And you need to take the JIT into account: prime all your code paths so you can guarantee they are compiled before you time them (unless you are concerned about start-up time).
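A minimal sketch of that setup, assuming a simplified version of the question's code (class A is dropped and the `Bench` helper names are made up here): the object is constructed before any timing starts, and each path is called once with a small count so it is JIT-compiled before the stopwatch runs.

```csharp
using System;
using System.Diagnostics;

class B
{
    public void Add(int a, int b, Action<long> result) { result(a + b); }
    public long Add2(int a, int b) { return a + b; }
}

static class Bench
{
    // Delegate path: the Action<long> is created once per call, not per iteration.
    public static long ViaDelegate(B bb, int n)
    {
        long sum = 0;
        Action<long> accumulate = r => sum += r;
        for (int i = 0; i < n; i++)
        {
            bb.Add(i - 1, i, accumulate);
        }
        return sum;
    }

    // Return-value path.
    public static long ViaReturn(B bb, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
        {
            sum += bb.Add2(i - 1, i);
        }
        return sum;
    }

    static void Main()
    {
        B bb = new B(); // constructed outside the timed region

        // Warm-up: run both paths once so they are compiled before timing.
        ViaDelegate(bb, 1000);
        ViaReturn(bb, 1000);

        const int Iterations = 100000000;

        Stopwatch sw = Stopwatch.StartNew();
        long viaDelegate = ViaDelegate(bb, Iterations);
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds + " ms : " + viaDelegate);

        sw = Stopwatch.StartNew();
        long viaReturn = ViaReturn(bb, Iterations);
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds + " ms : " + viaReturn);
    }
}
```

Build it in Release and run it outside the debugger; the two printed sums should match, which is a cheap check that both loops did the same work.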

I see an issue when you try to time a large quantity of primitive operations (the simple addition). In this case you can't make any definitive conclusions since any overhead will completely dominate your measurements.

edit: In release mode targeting .NET 3.5 in VS2008 I get:

1719 : 9999999800000000
1337 : 9999999800000000

Which seems to be consistent with many of the other answers. Using ILDasm gives the following IL for B.Add:

  IL_0000:  ldarg.3
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  add
  IL_0004:  conv.i8
  IL_0005:  callvirt   instance void class [mscorlib]System.Action`1<int64>::Invoke(!0)
  IL_000a:  ret

Where B.Add2 is:

  IL_0000:  ldarg.1
  IL_0001:  ldarg.2
  IL_0002:  add
  IL_0003:  conv.i8
  IL_0004:  ret

So it looks as though you're pretty much just timing a load and callvirt.


Why not use Reflector to find out?
