Advice needed for a physics engine
I've recently started a project, building a physics engine. I was hoping you could give me some advice on documentation and the best technologies for this.
First of all, I've seen that Game Physics Engine Development is highly recommended for the task at hand, and I was wondering if you could give me a second opinion. Should I get it? Also, while browsing Amazon, I stumbled onto Game Engine Architecture, and since I want to build my physics engine for games, I thought this might be a good read as well.
Second, I know that simulating physics is highly computation-intensive, so I would like to use either CUDA or OpenCL. Right now I'm leaning towards OpenCL, because it would work on both NVIDIA and ATI chipsets. What do you guys suggest?
PS: I will be implementing this in C++ on Linux.
Thanks.
I would suggest first of all planning a simple game as a test case for your engine. Having a basic game will drive feature and API development. Writing an engine without a clear goal makes the project riskier. While I agree NVIDIA and ATI should be treated as separate targets for performance reasons, I'd recommend you start with neither.
I personally wrote the physics engine for Uncharted: Drake's Fortune - a PS3 game - and I did a first pass in C++, and when it worked, made a pass to optimize it for VMX and then moved it to the SPUs. Mind you, I did just a fraction of what I initially wanted to do because of time constraints. After that I did an iteration to split the data stages out and formulate a pipeline of data transformations. That matters because, whether it's a CPU, GPU or SPU, modern processors running nontrivial code spend most of their time waiting on the caches. You have to pay special attention to data structures and pipeline them so that you have a small working set of data at any stage. E.g. first I do the broadphase, so I don't need the shapes but I do need world-space bounding boxes. So I split the bounding boxes into a separate array and compute them all together in another pass that writes them out in an optimal way. As input to the bbox computation, I need the shape transformations and some bounds from the shapes, but not necessarily the whole shapes. After the broadphase, I compute/update sim islands, at the same time performing the narrow phase, for which I actually do need the shapes. And so on - I described this with pictures in an article I wrote for Game Physics Pearls.
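To make the bounding-box split above a bit more concrete, here is a minimal C++ sketch of the idea, not the actual Uncharted code. All names (`Vec3`, `Aabb`, `BoundsInput`, `computeBounds`, `broadphase`) are made up for illustration, each body is assumed to be bounded by a sphere, and the broadphase is a brute-force O(n^2) loop purely to keep the example short. The point is only that each stage reads one small, contiguous array instead of walking whole shape objects.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

// Input to the bounding-box pass: just a world-space center and a
// conservative bounding radius per body, not the full shape data.
struct BoundsInput {
    std::vector<Vec3>  position;
    std::vector<float> radius;
};

// Stage 1: compute every world-space box in one pass and write the
// results into a compact array of their own.
std::vector<Aabb> computeBounds(const BoundsInput& in) {
    assert(in.position.size() == in.radius.size());
    std::vector<Aabb> boxes(in.position.size());
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const Vec3& p = in.position[i];
        const float r = in.radius[i];
        boxes[i] = Aabb{ Vec3{p.x - r, p.y - r, p.z - r},
                         Vec3{p.x + r, p.y + r, p.z + r} };
    }
    return boxes;
}

// Two boxes overlap if their intervals overlap on all three axes.
bool overlaps(const Aabb& a, const Aabb& b) {
    return a.min.x <= b.max.x && b.min.x <= a.max.x &&
           a.min.y <= b.max.y && b.min.y <= a.max.y &&
           a.min.z <= b.max.z && b.min.z <= a.max.z;
}

// Stage 2: the broadphase touches only the box array. Brute force here;
// a real engine would use sweep-and-prune, a grid, or a tree.
std::vector<std::pair<std::size_t, std::size_t>>
broadphase(const std::vector<Aabb>& boxes) {
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < boxes.size(); ++i)
        for (std::size_t j = i + 1; j < boxes.size(); ++j)
            if (overlaps(boxes[i], boxes[j]))
                pairs.emplace_back(i, j);
    return pairs;
}
```

The candidate pairs that come out of `broadphase` would then be handed to the narrow phase, which is the first stage that needs to load the actual shapes.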
I guess what I'm trying to say comes down to the following points:
- Make sure you have a clear goal that drives your development - for a game physics engine, a very basic game with a fleshed-out design would be best.
- Don't try to optimize before you have a working product. Write it in the simplest and fastest way possible first and fix all the bugs in the math. Design it so that you can port it to CUDA later, but don't start writing CUDA kernels before you have boxes rolling on the screen.
- After you write the first pass in C++, optimize it for the CPU: streamline it so that it doesn't thrash the cache, and compartmentalize the code so that there's no spaghetti of calls and all the code for each stage is localized. This will help you a) port to CUDA b) port to OpenCL c) port to a console d) make it run reasonably fast e) make it possible to debug.
- While developing, resist the temptation to go do something you just thought of unless that feature is necessary for your clear goal (see the first point) - that's why you need a goal, to steer you and make it possible to finish the actual project. Distractions usually kill projects without clear goals.
- Remember that in one way or another, software development is iterative. It's ok to do a rough-in and then refine it. Lather, rinse, repeat - it's the mantra of a programmer :)
It's easy to give advice. If you wanna do something, just go and do it, and we'll sit back and critique :)
Here is an answer regarding the choice of CUDA or OpenCL. I do not have a recommendation for a book.
If you want to run your program on both NVIDIA and ATI chipsets, then OpenCL will make the job easier. However, you will want to write a different version of each kernel to get good performance on each chipset. For example, on ATI cards you'll want to manually vectorize code using float4/int4 data types (or accept a nearly 4x performance penalty), while NVIDIA works better with scalar data types.
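As a rough illustration of the scalar versus float4 difference, here is the same position update written both ways. This is plain C++ standing in for kernel code, not real OpenCL: `Float4` is a hypothetical struct imitating OpenCL's built-in `float4`, and each loop iteration plays the role of one work item. The function names and the integration step itself are assumptions chosen only to keep the sketch small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for OpenCL's built-in float4 type.
struct Float4 { float x, y, z, w; };

inline Float4 operator+(Float4 a, Float4 b) {
    return Float4{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
inline Float4 operator*(Float4 a, float s) {
    return Float4{a.x * s, a.y * s, a.z * s, a.w * s};
}

// Scalar form: each "work item" (loop iteration) updates one float.
void integrateScalar(std::vector<float>& pos,
                     const std::vector<float>& vel, float dt) {
    assert(pos.size() == vel.size());
    for (std::size_t i = 0; i < pos.size(); ++i)
        pos[i] += vel[i] * dt;
}

// Manually vectorized form: each "work item" updates four floats at once,
// so the same data needs a quarter as many work items.
void integrateVec4(std::vector<Float4>& pos,
                   const std::vector<Float4>& vel, float dt) {
    assert(pos.size() == vel.size());
    for (std::size_t i = 0; i < pos.size(); ++i)
        pos[i] = pos[i] + vel[i] * dt;
}
```

Both functions compute the same result; only the granularity of the work differs, which is why keeping a per-chipset version of each kernel is feasible if the stages are small and self-contained.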
If you're only targeting NVIDIA, then CUDA is somewhat more convenient to program in.