Curious Performance difference between returning a value or returning through an Action<T> parameter
I was curious to see what the performance difference is between returning a value from a method and returning it through an Action<T> parameter.
There is a somewhat related question: Performance of calling delegates vs methods.
But for the life of me I can't explain why returning a value would be ~30% slower than calling a delegate to return the value. Is the .NET JIT compiler (not the C# compiler) inlining my simple delegate? I didn't think it did that.
using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        A aa = new A();
        long l = 0;
        // Version 1: the result is handed back through an Action<long> callback.
        for( int i = 0; i < 100000000; i++ )
        {
            aa.DoSomething( i - 1, i, r => l += r );
        }
        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );
        sw.Reset();
        sw.Start();
        l = 0;
        // Version 2: the result is handed back as an ordinary return value.
        for( int i = 0; i < 100000000; i++ )
        {
            l += aa.DoSomething2( i - 1, i );
        }
        sw.Stop();
        Trace.WriteLine( sw.ElapsedMilliseconds + " : " + l );
    }
}

class A
{
    private B bb = new B();

    public void DoSomething( int a, int b, Action<long> result )
    {
        bb.Add( a, b, result );
    }

    public long DoSomething2( int a, int b )
    {
        return bb.Add2( a, b );
    }
}

class B
{
    public void Add( int a, int b, Action<long> result )
    {
        result( a + b );
    }

    public long Add2( int i, int i1 )
    {
        return i + i1;
    }
}
I made a couple of changes to your code.
- Moved new A() before the timed section.
- Added warmup code before the timed section to get the methods JIT'ed.
- Created an Action<long> reference before the timed section and loop so that it does not have to be created on each iteration. This one seemed to have a big impact on execution time.
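The modified code itself was not posted, so the following is only a rough sketch of what those three changes might look like. The names Benchmark, SumViaAction, SumViaReturn and Run are hypothetical, the warmup count of 1000 is arbitrary, and Console.WriteLine is used in place of Trace.WriteLine so that the output shows up in a plain console run.

```csharp
using System;
using System.Diagnostics;

class B
{
    public void Add(int a, int b, Action<long> result) { result(a + b); }
    public long Add2(int a, int b) { return a + b; }
}

class A
{
    private readonly B bb = new B();
    public void DoSomething(int a, int b, Action<long> result) { bb.Add(a, b, result); }
    public long DoSomething2(int a, int b) { return bb.Add2(a, b); }
}

// Hypothetical helper class; not part of the original question or answer.
static class Benchmark
{
    // Callback path: the delegate is created once, outside the loop.
    public static long SumViaAction(A aa, int iterations)
    {
        long total = 0;
        Action<long> accumulate = r => total += r;
        for (int i = 0; i < iterations; i++)
        {
            aa.DoSomething(i - 1, i, accumulate);
        }
        return total;
    }

    // Return-value path.
    public static long SumViaReturn(A aa, int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            total += aa.DoSomething2(i - 1, i);
        }
        return total;
    }

    public static void Run(int iterations)
    {
        A aa = new A(); // constructed before any timing starts

        // Warmup: run both paths briefly so they are JIT-compiled first.
        SumViaAction(aa, 1000);
        SumViaReturn(aa, 1000);

        Stopwatch sw = Stopwatch.StartNew();
        long viaAction = SumViaAction(aa, iterations);
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds + " : " + viaAction);

        sw = Stopwatch.StartNew();
        long viaReturn = SumViaReturn(aa, iterations);
        sw.Stop();
        Console.WriteLine(sw.ElapsedMilliseconds + " : " + viaReturn);
    }
}

class Program
{
    static void Main()
    {
        Benchmark.Run(100000000);
    }
}
```

Moving each loop into its own method is a choice made for this sketch, not something the answer describes, and it may itself shift the numbers slightly.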
Here are my results after making the above changes. The vshost column indicates whether the code was executing inside the vshost.exe process (by running directly from Visual Studio). I was using Visual Studio 2008 and targeted .NET 3.5 SP1.
vshost?    Debug    Release
---------------------------
YES         6405       3827
           11059       3092
NO          4214       1691
            4607        811
Notice how you get different results depending on the build configuration and the execution environment. The results are interesting if nothing else. If I get time I might edit my answer to provide a theory.
Strangely, I'm not seeing the behavior you're describing when running a Release build in VS. I am seeing it when running a Debug build. The only thing I can figure is that there's added overhead with the return-based approach when running the Debug build, though I'm not clever enough to see why.
Here's something else that's interesting: this discrepancy disappears when I switch to an x64 build (Release or Debug).
If I were to venture a guess (completely unsubstantiated), it might be that the cost of passing the 64-bit long as a return value in both B.Add2 and A.DoSomething2 outweighs that of passing the Action<long> in a 32-bit environment. In a 64-bit environment, this savings would vanish as the Action<long> would require 64 bits as well. In a Release build in either configuration, the cost of passing the long probably disappears as both B.Add2 and A.DoSomething2 seem like prime candidates for inlining.
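Since this guess hinges on whether the process is running as 32-bit or 64-bit, it may be worth confirming that at run time rather than relying on the project's platform setting. A minimal sketch (the Bitness class is a hypothetical name; treating IntPtr.Size as the size of an object reference is an assumption that holds on the desktop CLR as far as I know):

```csharp
using System;

// Hypothetical helper; not part of the original code.
static class Bitness
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
    public static string Describe()
    {
        return IntPtr.Size == 8 ? "64-bit" : "32-bit";
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("Process: " + Bitness.Describe());
        // A long is always 8 bytes; a reference such as Action<long> is pointer-sized.
        Console.WriteLine("sizeof(long): " + sizeof(long) + " bytes");
        Console.WriteLine("Pointer size: " + IntPtr.Size + " bytes");
    }
}
```

Printing this next to the timings makes it clear which environment each measurement came from.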
Somebody who knows way more about this than I do: feel free to totally refute everything I just said. We're all here to learn, after all ;)
Well, for starters, your call to new A() is being timed the way you currently have your code set up. You need to make sure you're running in release mode with optimizations on as well. Also, you need to take the JIT into account: prime all your code paths so you can guarantee they are compiled before you time them (unless you are concerned about start-up time).
I see an issue when you try to time a large quantity of primitive operations (the simple addition). In this case you can't make any definitive conclusions since any overhead will completely dominate your measurements.
edit: In release mode targeting .NET 3.5 in VS2008 I get:
1719 : 9999999800000000
1337 : 9999999800000000
Which seems to be consistent with many of the other answers. Using ILDasm gives the following IL for B.Add:
IL_0000: ldarg.3
IL_0001: ldarg.1
IL_0002: ldarg.2
IL_0003: add
IL_0004: conv.i8
IL_0005: callvirt instance void class [mscorlib]System.Action`1<int64>::Invoke(!0)
IL_000a: ret
Where B.Add2 is:
IL_0000: ldarg.1
IL_0001: ldarg.2
IL_0002: add
IL_0003: conv.i8
IL_0004: ret
So it looks as though you're pretty much just timing a load and callvirt.
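To connect that callvirt back to the source: in C#, calling a delegate as result( a + b ) is shorthand for result.Invoke( a + b ), which is where the Action`1<int64>::Invoke call in the IL comes from. A small illustration (InvokeDemo and its method names are made up for this example):

```csharp
using System;

// Hypothetical demo class; not part of the original code.
static class InvokeDemo
{
    // Delegate call written with the usual call syntax.
    public static long ViaShorthand(int a, int b)
    {
        long captured = 0;
        Action<long> result = r => captured = r;
        result(a + b);
        return captured;
    }

    // The same call written out explicitly as Invoke.
    public static long ViaExplicitInvoke(int a, int b)
    {
        long captured = 0;
        Action<long> result = r => captured = r;
        result.Invoke(a + b);
        return captured;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(InvokeDemo.ViaShorthand(3, 4));      // prints 7
        Console.WriteLine(InvokeDemo.ViaExplicitInvoke(3, 4)); // prints 7
    }
}
```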
Why not use Reflector to find out?