开发者

Can you program a pure GPU game?

I'm a CS master student, and next semester I will have to sta开发者_如何学JAVArt working on my thesis. I've had trouble coming up with a thesis idea, but I decided it will be related to Computer Graphics as I'm passionate about game development and wish to work as a professional game programmer one day.

Unfortunately I'm kinda new to the field of 3D Computer Graphics, I took an undergraduate course on the subject and hope to take an advanced course next semester, and I'm already reading a variety of books and articles to learn more. Still, my supervisor thinks its better if I come up with a general thesis idea now and then spend time learning about it in preparation for doing my thesis proposal. My supervisor has supplied me with some good ideas but I'd rather do something more interesting on my own, which hopefully has to do with games and gives me more opportunities to learn more about the field. I don't care if it's already been done, for me the thesis is more of an opportunity to learn about things in depth and to do substantial work on my own.

I don't know much about GPU programming and I'm still learning about shaders and languages like CUDA. One idea I had is to program an entire game (or as much as possible) on the GPU, including all the game logic, AI, and tests. This is inspired by reading papers on GPGPU and questions like this one I don't know how feasible that is with my knowledge, and my supervisor doesn't know a lot about recent GPUs. I'm sure with time I will be able to answer this question on my own, but it'd be handy if I could know the answer in advance so I could also consider other ideas.

So, if you've got this far, my question: Using only shaders or something like CUDA, can you make a full, simple 3D game that exploits the raw power and parallelism of GPUs? Or am I missing some limitation or difference between GPUs and CPUs that will always make a large portion of my code bound to CPU? I've read about physics engines running on the GPU, so why not everything else?


DISCLAIMER: I've done a PhD, but have never supervised a student of my own, so take all of what I'm about to say with a grain of salt!

I think trying to force as much of a game as possible onto a GPU is a great way to start off your project, but eventually the point of your work should be: "There's this thing that's an important part of many games, but in it's present state doesn't fit well on a GPU: here is how I modified it so it would fit well".

For instance, fortran mentioned that AI algorithms are a problem because they tend to rely on recursion. True, but, this is not necessarily a deal-breaker: the art of converting recursive algorithms into an iterative form is looked upon favorably by the academic community, and would form a nice center-piece for your thesis.

However, as a masters student, you haven't got much time so you would really need to identify the kernel of interest very quickly. I would not bother trying to get the whole game to actually fit onto the GPU as part of the outcome of your masters: I would treat it as an exercise just to see which part won't fit, and then focus on that part alone.

But be careful with your choice of supervisor. If your supervisor doesn't have any relevant experience, you should pick someone else who does.


I'm still waiting for a Gameboy Emulator that runs entirely on the GPU, which is just fed the game ROM itself and current user input and results in a texture displaying the game - maybe a second texture for sound output :)

The main problem is that you can't access persistent storage, user input or audio output from a GPU. These parts have to be on the CPU, by definition (even though cards with HDMI have audio output, but I think you can't control it from the GPU). Apart from that, you can already push large parts of the game code into the GPU, but I think it's not enough for a 3D game, since someone has to feed the 3D data into the GPU and tell it which shaders should apply to which part. You can't really randomly access data on the GPU or run arbitrary code, someone has to do the setup.

Some time ago, you would just setup a texture with the source data, a render target for the result data, and a pixel shader that would do the transformation. Then you rendered a quad with the shader to the render target, which would perform the calculations, and then read the texture back (or use it for further rendering). Today, things have been made simpler by the fourth and fifth generation of shaders (Shader Model 4.0 and whatever is in DirectX 11), so you can have larger shaders and access memory more easily. But still they have to be setup from the outside, and I don't know how things are today regarding keeping data between frames. In worst case, the CPU has to read back from the GPU and push again to retain game data, which is always a slow thing to do. But if you can really get to a point where a single generic setup/rendering cycle would be sufficient for your game to run, you could say that the game runs on the GPU. The code would be quite different from normal game code, though. Most of the performance of GPUs comes from the fact that they execute the same program in hundreds or even thousands of parallel shading units, and you can't just write a shader that can draw an image to a certain position. A pixel shader always runs, by definition, on one pixel, and the other shaders can do things on arbitrary coordinates, but they don't deal with pixels. It won't be easy, I guess.

I'd suggest just trying out the points I said. The most important is retaining state between frames, in my opinion, because if you can't retain all data, all is impossible.


First, Im not a computer engineer so my assumptions cannot even be a grain of salt, maybe nano scale.

  • Artificial intelligence? No problem.There are countless neural network examples running in parallel in google. Example: http://www.heatonresearch.com/encog
  • Pathfinding? You just try some parallel pathfinding algorithms that are already on internet. Just one of them: https://graphics.tudelft.nl/Publications-new/2012/BB12a/BB12a.pdf
  • Drawing? Use interoperability of dx or gl with cuda or cl so drawing doesnt cross pci-e lane. Can even do raytracing at corners so no z-fighting anymore, even going pure raytraced screen is doable with mainstream gpu using a low depth limit.
  • Physics? The easiest part, just iterate a simple Euler or Verlet integration and frequently stability checks if order of error is big.
  • Map/terrain generation? You just need a Mersenne-twister and a triangulator.
  • Save game? Sure, you can compress the data parallelly before writing to a buffer. Then a scheduler writes that data piece by piece to HDD through DMA so no lag.
  • Recursion? Write your own stack algorithm using main vram, not local memory so other kernels can run in wavefronts and GPU occupation is better.
  • Too much integer needed? You can cast to a float then do 50-100 calcs using all cores then cast the result back to integer.
  • Too much branching? Compute both cases if they are simple, so every core is in line and finish in sync. If not, then you can just put a branch predictor of yourself so the next time, it predicts better than the hardware(could it be?) with your own genuine algorithm.
  • Too much memory needed? You can add another GPU to system and open DMA channel or a CF/SLI for faster communication.
  • Hardest part in my opinion is the object oriented design since it is very weird and hardware dependent to build pseudo objects in gpu. Objects should be represented in host(cpu) memory but they must be separated over many arrays in gpu to be efficient. Example objects in host memory: orc1xy_orc2xy_orc3xy. Example objects in gpu memory: orc1_x__orc2_x__ ... orc1_y__orc2_y__ ...


The answer has already been chosen 6 years ago but for those interested to the actual question, Shadertoy, a live-coding WebGL platform, recently added the "multipass" feature allowing preservation of state.

Here's a live demo of the Bricks game running on Gpu.


I don't care if it's already been done, for me the thesis is more of an opportunity to learn about things in depth and to do substantial work on my own.

Then your idea of what a thesis is is completely wrong. A thesis must be an original research. --> edit: I was thinking about a PhD thesis, not a master thesis ^_^

About your question, the GPU's instruction sets and capabilities are very specific to vector floating point operations. The game logic usually does little floating point, and much logic (branches and decision trees).

If you take a look to the CUDA wikipedia page you will see:

It uses a recursion-free, function-pointer-free subset of the C language

So forget about implementing any AI algorithms there, that are essentially recursive (like A* for pathfinding). Maybe you could simulate the recursion with stacks, but if it's not allowed explicitly it should be for a reason. Not having function pointers also limits somewhat the ability to use dispatch tables for handling the different actions depending on state of the game (you could use again chained if-else constructions, but something smells bad there).

Those limitations in the language reflect that the underlying HW is mostly thought to do streaming processing tasks. Of course there are workarounds (stacks, chained if-else), and you could theoretically implement almost any algorithm there, but they will probably make the performance suck a lot.

The other point is about handling the IO, as already mentioned up there, this is a task for the main CPU (because it is the one that executes the OS).


It is viable to do a masters thesis on a subject and with tools that you are, when you begin, unfamiliar. However, its a big chance to take!

Of course a masters thesis should be fun. But ultimately, its imperative that you pass with distinction and that might mean tackling a difficult subject that you have already mastered.

Equally important is your supervisor. Its imperative that you tackle some problem they show an interest in - that they are themselves familiar with - so that they can become interested in helping you get a great grade.

You've had lots of hobby time for scratching itches, you'll have lots more hobby time in the future too no doubt. But master thesis time is not the time for hobbies unfortunately.


Whilst GPUs today have got some immense computational power, they are, regardless of things like CUDA and OpenCL limited to a restricted set of uses, whereas the CPU is more suited towards computing general things, with extensions like SSE to speed up specific common tasks. If I'm not mistaken, some GPUs have the inability to do a division of two floating point integers in hardware. Certainly things have improved greatly compared to 5 years ago.

It'd be impossible to develop a game to run entirely in a GPU - it would need the CPU at some stage to execute something, however making a GPU perform more than just the graphics (and physics even) of a game would certainly be interesting, with the catch that game developers for PC have the biggest issue of having to contend with a variety of machine specification, and thus have to restrict themselves to incorporating backwards compatibility, complicating things. The architecture of a system will be a crucial issue - for example the Playstation 3 has the ability to do multi gigabytes a second of throughput between the CPU and RAM, GPU and Video RAM, however the CPU accessing GPU memory peaks out just past 12MiB/s.


The approach you may be looking for is called "GPGPU" for "General Purpose GPU". Good starting points may be:

  • http://en.wikipedia.org/wiki/GPGPU
  • http://gpgpu.org/

Rumors about spectacular successes in this approach have been around for a few years now, but I suspect that this will become everyday practice in a few years (unless CPU architectures change a lot, and make it obsolete).

The key here is parallelism: if you have a problem where you need a large number of parallel processing units. Thus, maybe neural networks or genetic algorithms may be a good range of problems to attack with the power of a GPU. Maybe also looking for vulnerabilities in cryptographic hashes (cracking the DES on a GPU would make a nice thesis, I imagine :)). But problems requiring high-speed serial processing don't seem so much suited for the GPU. So emulating a GameBoy may be out of scope. (But emulating a cluster of low-power machines might be considered.)


I would think a project dealing with a game architecture that targets multiple core CPUs and GPUs would be interesting. I think this is still an area where a lot of work is being done. In order to take advantage of current and future computer hardware, new game architectures are going to be needed. I went to GDC 2008 and there were ome talks related to this. Gamebryo had an interesting approach where they create threads for processing computations. You can designate the number of cores you want to use so that if you don't starve out other libraries that might be multi-core. I imagine the computations could be targeted to GPUs as well. Other approaches included targeting different systems for different cores so that computations could be done in parallel. For instance, the first split a talk suggested was to put the renderer on its own core and the rest of the game on another. There are other more complex techniques but it all basically boils down to how do you get the data around to the different cores.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜