Using P/invoke to improve performance, feasible or just wishful thinking?
This is a question I probably should have asked sooner but didn't in my rush to have fun with p/invoke type stuff in MonoTouch.
Basically I have a performance problem with a very large number of floating point operations, specifically ones involving min/max functions, vector multiplication and similar stuff (basically detecting whether different kinds of shapes are intersecting).
These operations come from a 2D physics engine written in C#.
On some platforms, such as Windows Phone 7 and the Xbox 360, the physics engine runs without any hitches: it steals some CPU cycles but leaves plenty to keep the game running at a steady framerate.
The problem is MonoTouch running on the iPhone. It appears MonoTouch is not all that great with so many floating point operations: the iPhone (and EVEN the iPad 2) is severely impacted, and physics is the obvious performance bottleneck. I've profiled it and it comes down to the set of relatively basic math functions I mentioned before, and there's no real way of optimizing those functions. The physics engine itself is very well written, I can't see any obvious place where it's lagging, and frankly I doubt there's anything wrong with it as a 2D C# physics engine.
To that end I've resolved to find a physics engine written in C (or C++ if possible) and hook it up to the main MonoTouch application. My reasoning is that the performance issues probably come from the MonoTouch compiler not producing code that runs as fast as what the WP7/Xbox 360 JIT compilers produce (which is understandable), so moving things out of MonoTouch and running them natively should help improve performance.
So my idea is that I'll use Box2D, write a bunch of static wrapper functions (such as CreateWorld(), CreateBox(), GetBodyPosition(int id), etc.), hook all that in via P/Invoke and integrate it into my physics wrapper class. That way the core game logic will need minimal to no modifications, and I can keep the original code design intact while gaining a performance boost from the physics running natively.
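To make the idea concrete, here is a rough sketch of what the native side of such a wrapper might look like: a flat set of C functions that hand out integer ids. The function names follow the ones mentioned above, but the signatures, the fixed-size body table and the single global world are all assumptions made for illustration. The storage is only a placeholder for where calls into Box2D would go, and none of this is Box2D's actual API.

```c
/* Hypothetical flat wrapper surface of the kind P/Invoke can bind to.
   The body table below is a placeholder for the real physics engine;
   every name and signature here is an assumption, not Box2D's API. */

#define MAX_BODIES 1024

typedef struct { float x, y, w, h; } Body;

static Body g_bodies[MAX_BODIES];
static int  g_body_count = 0;

/* Resets the single global world to an empty state. */
void CreateWorld(void)
{
    g_body_count = 0;
}

/* Adds a box and returns its integer id, or -1 if the world is full. */
int CreateBox(float x, float y, float w, float h)
{
    if (g_body_count >= MAX_BODIES)
        return -1;
    Body b = { x, y, w, h };
    g_bodies[g_body_count] = b;
    return g_body_count++;
}

/* Writes the body's position to out_x/out_y.
   Returns 0 on success, -1 for an unknown id or a null output pointer. */
int GetBodyPosition(int id, float *out_x, float *out_y)
{
    if (id < 0 || id >= g_body_count || !out_x || !out_y)
        return -1;
    *out_x = g_bodies[id].x;
    *out_y = g_bodies[id].y;
    return 0;
}
```

Passing plain ints and floats across the boundary keeps marshalling simple, since the managed side never has to hold a pointer to a native object.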
But this got me thinking: the performance issues stem from very simple and straightforward mathematical functions, just multiplications and size comparisons. If running functions via P/Invoke would improve speed, then would simply rewriting a function such as Vector2.Max as a C function and invoking that also improve performance?
This seems a bit far-fetched, though: if that were the case, wouldn't Mono do it anyway?
So I guess my overall question is, do statically linked native libraries perform better when called from p/invoke than the equivalent C# function compiled by MonoTouch?
There is really only one way to figure out if it's faster: benchmark your case. It might be that the C compiler you'd use for the native library is able to optimize more than Mono's JIT does, or it might just be that the CPU isn't well suited to your particular workload.
If you want to try out a native library, keep in mind that you do not want to call native code from managed code often. There is a significant performance hit for each managed-to-native transition (how significant depends on a lot of factors, but fewer transitions is always better).
That said, my personal guess would be that the Mono JIT doesn't handle floating point operations as well as it could (there have been issues with this before on some CPUs; I don't remember now whether those issues still exist, nor on which CPUs), so you would benefit from using a native library.
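As an illustration of reducing transitions, here is a minimal sketch in C of the same component-wise max exposed two ways: once per pair, and once over whole arrays. The names `Vec2`, `Vec2Max` and `Vec2MaxBatch` are made up for this example and do not come from any real library.

```c
#include <stddef.h>

/* Hypothetical 2D vector type; two floats, laid out contiguously. */
typedef struct { float x, y; } Vec2;

/* Component-wise max of one pair. If this were called through P/Invoke,
   every pair would cost one managed-to-native transition. */
Vec2 Vec2Max(Vec2 a, Vec2 b)
{
    Vec2 r;
    r.x = a.x > b.x ? a.x : b.x;
    r.y = a.y > b.y ? a.y : b.y;
    return r;
}

/* Batched variant: one call processes `count` pairs, so the transition
   cost is paid once per batch instead of once per pair. */
void Vec2MaxBatch(const Vec2 *a, const Vec2 *b, Vec2 *out, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        out[i] = Vec2Max(a[i], b[i]);
}
```

Whether the batched form actually wins in a given game still has to be measured, since copying data into arrays has its own cost.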
If you have some critical sections of your code you could try to use "unsafe" to avoid the automatic range checking for array access. For short arrays such as 2- or 3-vectors and simple operations (like norm or dot product) it may have an effect.
Simple mathematical operations should not be slower, but it is hard to tell which parts of the code are slower without profiling results. If the native version uses SSE heavily, or if your managed version unintentionally does a lot of boxing etc., that may cause the discrepancy.
For P/invoke to be useful you should do a few calls that do a lot of work, so it may be useful if you can update the whole state in one call to the native dll. But I'd definitely not try to make trivial functions native.
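A sketch of what "update the whole state in one call" could look like on the native side, written in C. Everything here is an assumption for illustration: the `Body` layout, the `StepAndReadPositions` name, and the plain Euler integration, which only stands in for whatever the real physics engine does per step.

```c
#include <stddef.h>

/* Hypothetical body state: position and velocity. */
typedef struct { float x, y, vx, vy; } Body;

/* Advances every body by dt and copies positions into out_xy as
   x0, y0, x1, y1, ... in a single call.
   out_capacity is the number of floats out_xy can hold.
   Returns how many bodies were written to out_xy. */
size_t StepAndReadPositions(Body *bodies, size_t count, float dt,
                            float *out_xy, size_t out_capacity)
{
    size_t writable = out_capacity / 2;
    size_t n = count < writable ? count : writable;

    /* Placeholder integration step; a real engine would do
       collision detection and resolution here. */
    for (size_t i = 0; i < count; ++i) {
        bodies[i].x += bodies[i].vx * dt;
        bodies[i].y += bodies[i].vy * dt;
    }

    /* Copy out as many positions as the caller's buffer can take. */
    for (size_t i = 0; i < n; ++i) {
        out_xy[2 * i]     = bodies[i].x;
        out_xy[2 * i + 1] = bodies[i].y;
    }
    return n;
}
```

With this shape the managed side makes one call per frame and reads all positions from the buffer, rather than calling something like GetBodyPosition once per body.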
In addition to Rolf's great response, you might consider using the LLVM optimization when building your game.
I find it surprising that Windows Phone 7 has a better JIT than Mono's static compiler.