My experiment shows that rendering order affects performance a lot on a TBR architecture. Why?
TBR chips perform HSR (hidden surface removal) before fragment processing, so only the visible pixels are rendered. This should make it unnecessary to sort opaque objects from front to back. However, I ran an experiment on my iPhone 3GS: comparing frame times, rendering opaque objects from front to back is much faster than rendering them back to front. Why does it show this result? I would expect the performance to be very close regardless of the order in which objects are rendered.
I believe that the optimization to skip fragment processing is done by using the Z-buffer to determine whether a pixel is visible (and exiting the pipeline early if it isn't). As a result, rendering back-to-front is the worst case for that optimization (no optimization possible) and front-to-back is the best case (every pixel that ends up hidden is already hidden when it is submitted).
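To make that reasoning concrete, here is a minimal sketch of a Z-buffer early-out at a single pixel. It is a toy model under the assumption described above, not a description of what the iPhone GPU actually does; the function name `early_z_shaded_count` and the depth values are made up for illustration.

```python
def early_z_shaded_count(depths):
    """Count fragments that survive an early depth test at one pixel.

    depths: depth of each opaque fragment in submission order
    (smaller value = closer to the camera).
    """
    nearest = float("inf")  # depth buffer after a clear
    shaded = 0
    for z in depths:
        if z < nearest:     # fragment is in front of what is stored
            nearest = z     # update the depth buffer
            shaded += 1     # the fragment shader would run here
        # otherwise the fragment is rejected before shading
    return shaded


layers = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical depths of five overlapping objects
front_to_back = early_z_shaded_count(layers)
back_to_front = early_z_shaded_count(list(reversed(layers)))
print(front_to_back, back_to_front)
```

In this model, front-to-back submission shades one fragment per pixel, while back-to-front shades every layer, which would match the direction of the measured difference.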
If true, that contradicts Apple's documentation on the topic:
- Do not waste CPU time sorting objects front to back. OpenGL ES for iPhone and iPod touch implement a tile-based deferred rendering model that makes this unnecessary. See “Tile-Based Deferred Rendering” for more information.
Do sort objects by their opacity:
- Draw opaque objects first.
- Next draw objects that require alpha testing (or in an OpenGL ES 2.0 based application, objects that require the use of discard in the fragment shader.) Note that these operations have a performance penalty, as described in “Avoid Alpha Test and Discard.”
- Finally, draw alpha-blended objects.
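The ordering in that list can be sketched as a stable sort on an opacity class. This is only an illustration: `Drawable`, the three constants, and `submission_order` are hypothetical names, not part of OpenGL ES or any Apple API.

```python
from dataclasses import dataclass

# Hypothetical opacity classes, numbered in the order they should be drawn.
OPAQUE, ALPHA_TESTED, BLENDED = 0, 1, 2


@dataclass
class Drawable:
    name: str
    kind: int  # one of OPAQUE, ALPHA_TESTED, BLENDED


def submission_order(drawables):
    """Group drawables by opacity class without depth-sorting within a class."""
    # sorted() is stable, so objects keep their original relative order
    # inside each class.
    return sorted(drawables, key=lambda d: d.kind)


scene = [
    Drawable("window", BLENDED),
    Drawable("wall", OPAQUE),
    Drawable("fence", ALPHA_TESTED),
    Drawable("floor", OPAQUE),
]
print([d.name for d in submission_order(scene)])
```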
As well as the documentation here:
Another advantage of deferred rendering is that it allows the GPU to perform hidden surface removal before fragments are processed. Pixels that are not visible are discarded without sampling textures or performing fragment processing, significantly reducing the calculations that the GPU must perform to render the scene. To gain the most benefit from this feature, you should try to draw as much of the scene with opaque content as possible and minimize use of blending, alpha testing, and the discard instruction in GLSL shaders. Because the hardware performs hidden surface removal, it is not necessary for your application to sort its geometry from front to back.
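Read literally, that passage describes a model in which visibility is resolved for a whole tile before any shading happens. A minimal sketch of that idealized model (not of the real hardware; `deferred_shaded_count` and the sample fragments are assumptions for illustration) shows why the documentation expects submission order not to matter for opaque geometry:

```python
def deferred_shaded_count(fragments):
    """Count shaded fragments in one tile when visibility is resolved first.

    fragments: iterable of (pixel, depth) pairs for opaque geometry,
    in submission order (smaller depth = closer to the camera).
    """
    nearest = {}
    for pixel, depth in fragments:
        # Visibility pass: remember only the closest fragment per pixel.
        if pixel not in nearest or depth < nearest[pixel]:
            nearest[pixel] = depth
    # Shading pass: exactly one fragment per covered pixel is shaded.
    return len(nearest)


tile = [((0, 0), 0.9), ((0, 0), 0.1), ((1, 0), 0.5), ((1, 0), 0.7)]
print(deferred_shaded_count(tile), deferred_shaded_count(list(reversed(tile))))
```

Under this model the shaded count is the same for both orders, so a large measured difference between front-to-back and back-to-front suggests that something in the test scene or the pipeline departs from the idealized case.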