Culling techniques for rendering lots of cubes
I am working on a personal learning project to make a Minecraft clone. It is working very well aside from one thing. Similar to Minecraft, my terrain has lots of cubes stacked on the Y so you can dig down. Although I do frustum culling, this still means that I uselessly draw all the layers of cubes below me. The cubes are X, Y and Z ordered (although only in 1 direction, so its not technically Z ordered to the camera). I basically from the player's position only add pointers to cubes around the player. I then do frustum culling against these. I do not do oct tree subdivision. I thought of simply not rendering the layers below the player, except this does not work if the player looks down into a hole. Given this, how could I avoid rendering cubes below me that I cannot see, or also cubes that are hidden by other cubes.
Thanks
void CCubeGame::SetPlayerPosition()
{
PlayerPosition.x = Camera.x / 3;
PlayerPosition.y = ((Camera.y - 2.9) / 3) - 1;
PlayerPosition.z = Camera.z / 3;
}
void CCubeGame::SetCollids()
{
SetPlayerPosition();
int xamount = 70;
int zamount = 70;
int yamount = 17;
int xamountd = xamount * 2;
int zamountd = zamount * 2;
int yamountd = yamount * 2;
PlayerPosition.x -= xamount;
PlayerPosition.y -= yamount;
PlayerPosition.z -= zamount;
collids.clear();
CBox* tmp;
    for(int i = 0; i < xamountd; ++i)
    {
        for(int j = yamountd; j > 0; --j)
        {
            for(int k = zamountd; k > 0; --k)
            {
                tmp = GetCube(PlayerPosition.x + i, PlayerPosition.y + j, PlayerPosition.z + k);
                if(tmp != 0)
        开发者_如何转开发        {
                    if(frustum.sphereInFrustum(tmp->center,25) != NULL)
                    {
                        collids.push_back(tmp);
                    }
                }
            }
        }
}
Here is what I've learned while writing my own clone:
- Don't just dump every cube into OpenGL, but also don't worry about doing all of the visibility pruning yourself. As another answer stated, check all 6 faces to see if they are fully occluded by an adjacent block. Only render faces that could be visible. This roughly reduces your face count from a cubic term (a volume of cubes n*n*n) to a squared term (surface of only about n*n).
- OpenGL can do view frustrum culling much faster than you can. Once you have rendered all of your surface faces into a display list or VBO, just send the entire blob to OpenGL. If you break your geometry into slices (or what Minecraft calls chunks) you might avoid drawing the chunks you can easily determine are behind the camera.
- Render your entire geometry into a display list (or lists) and redraw that each time. This is an easy step to take if you're using immediate mode because you just wrap your existing code in glNewList/glEndList and redraw with glCallList. Reducing the OpenGL call count (per frame) will have a vastly bigger impact than reducing the total volume of polygons to render.
- Once you see how much longer it takes to generate the display lists than to draw them, you'll start thinking about how to put the updates into a thread. This is where conversion to VBOs pays off: The thread renders into plain old arrays (adding 3 floats to an array instead of calling glVertex3f, for example) and then the GL thread only has to load those into the card with glBufferSubData. You win twice: The code can run in a thread, and it can "draw" a point with 3 array writes instead of 3 function calls.
Other things I've noticed:
VBOs and display lists have very similar performance. It's quite possible that a given OpenGL implementation uses a VBO internally to store a display list. I skipped right by vertex arrays (a sort of client-side VBO) so I'm not sure about those. Use the ARB extension version of VBOs instead of the GL 1.5 standard because the Intel drivers only implement the extension (despite claiming to support 1.5) and the nvidia and ATI drivers don't care.
Texture atlases rule. If you are using one texture per face, look at how atlases work.
If you want to see my code, find me on github.
Render front to back. To do so you don't need sorting, use octrees. The leaves won't be individual cubes, rather larger groups of those.
A mesh for each such leaf should be cached in a vertex buffer. When you generate this mesh do not generate all the cubes in a brute-force manner. Instead, for each cube face check if it has an opaque neighbor within the same leaf, if so you don't need to generate this face at all. Thus you render only the surface between solid cubes and empty space. You can also unify neighboring faces with the same material into a single long rectangle. You can also separate the mesh to six sets, one set for each principal direction: +/-XYZ faces. Draw only those sets of faces that may face the camera.
Rendering front to back doesn't help that much by itself. However you can use occlusion culling offered by modern hardware to benefit from this ordering. Before rendering an octree leaf, check if its bbox passes the occlusion query. If it doesn't pass you don't need to draw it at all.
Alternative approach to occlusion query may be ray-tracing. Ray tracing is good for rendering such environment. You can cast a sparse set of rays to approximate what leaves are visible and draw those leaves only. However this will underestimate the visibility set.
Like others I've been playing around with a block world "engine" using Ogre and writing some articles as I go (see Block World Articles). The basic approach I've been taking is:
- Only create the visible faces of blocks (not faces between blocks).
- Split up world into smaller chunks (only necessary for faster updating of individual blocks).
- Combine block textures into one texture file (texture atlas).
Just using these can get you very good performance on large simple block worlds (for example, 1024x1024x1024 on decent hardware).
I am currently working on a minecraft clone in python/pyglet, just for curiosity.
I break down the data into chunks, as in minecraft, and then for each chunk create an opengl displaylist based on cube visibility. I then perform simple 2d frustum culling on these chunks and call each display list within a certain distance of the player.
On cube addition / removal I recreate the displaylist for the chunk.
There is no occlusion culling except for cubes which are completely surrounded by other cubes.
For simple scenes this can reach over 600fps on a modest gfx card with a view distance of around 200 cubes.
Octtree etc. will work for sure, but another possible way to solve your specific problem might be to store to add an usigned char visible to every cube object. The value of the visible field is calculated as following:
- if the right side (looking along the x axis) has no neighbor, then the first bit (1) will be set.
- if the left side (looking along the negative x axis) has no neighbor, then the 2nd bit (2) will be set.
- if the front side (looking along the z axis) has no neighbor, then the 3rd bit (4) will be set
- ... and so on, so that you have 1 bit for each of the 6 sides of a cube
Whenever the player digs a cube away, you have to update the visible field of all of it's neighbor cubes.
So, but how will this help? If a cube has a visible value of 0 then it's easy - the cube will be never shown. But let's say the cube has a visible value of 1. Then the cube (might) only be visible, if Xplayer < Xcube. The other sides work similar, and I think such a function which decides if a cube might be visible will be quite fast and is able to skip a lot of hidden cubes.
The disadvantage of that, is that this test is only a per-cube test and you can't skip complete groups with that. So, maybe an octtree (for skipping complete regions) and such a visible field like described here, for skipping the high number of hidden cubes (Since such cubes are stacked, the number of hidden cubes will be much higher than the number of the visible ones) inside that region might be a good solution.
You can use PVS(Potentially visible set) for this, although its generally for terrain, the same principles apply, cull what cannot be seen. Gamedev.net has a terrain morphing article covering this as well.
In case only drawing is the problem (and not rotation of unused vertices), couldn't be something like c-buffer useful? I used it with quite a success, it requires sorted polygons (for example by painter's algorithm) and nearly zero memory (in contrast to z-buffer).
Only keep track of the cubes describing your surface. You can do this by a simple data-structure where each cube keeps references to it's neighbors. On any modern graphic card it shouldn't be a problem to push all those triangles. Render from back to front. Also only render cubes closer then a specific distance to the viewer. The "world" could start with a huge "cube-of-cubes", the outer shells of a cube, made by cubes. If someone digs down, you have the check if the neighbor locations already contains cubes or not, if not you create those cubes and link them in.
Example: Digging down one flat surface: Remove cube that's positioned where the digging is done, add 9 new cubes underneath (But check if those positions are already in use, in case use those), link together the surface but linking the new cubes to the neighbor cubes of the one removed.
So: For a world containing 1000x1000x1000 cubes you would get: 1000*1000*2+998*999*4 = 5988008 instead of 1000*1000*1000 = 1000000000, or a factor 167 less number of cubes.
Of course you shouldn't draw all those cubes, start with simple distance to viewer to do a sub select.
You could also go into grouping 8 cubes(groups) together as 1 group, and then continue like that until on the top level you have just one cube-group (oct-tree already mentioned). This tree can be used to ray-trace what part of the world you need to draw and not draw. If a group contains references to 8 other cubes-groups, those behind is not shown. With behind in this would be those cube-groups that does not intersect or lay outside of a ray-traced cone starting from the user and passes just by the edge of the group used for testing. (Bad description, but I hope you would get some hints anyhow on what could be done for optimization). This would likely not be needed on todays graphics cards anyhow.
Good luck with your project.
 
         加载中,请稍侯......
 加载中,请稍侯......
      
精彩评论