Once upon a time, when > was faster than < ... Wait, what?

2023-04-03 13:44

I am reading an awesome OpenGL tutorial. It's really great, trust me. The topic I am currently at is the Z-buffer. Aside from explaining what it's all about, the author mentions that we can perform custom depth tests, such as GL_LESS, GL_ALWAYS, etc. He also explains that the actual meaning of depth values (which is top and which isn't) can also be customized. I understand so far. And then the author says something unbelievable:

The range zNear can be greater than the range zFar; if it is, then the window-space values will be reversed, in terms of what constitutes closest or farthest from the viewer.

Earlier, it was said that the window-space Z value of 0 is closest and 1 is farthest. However, if our clip-space Z values were negated, the depth of 1 would be closest to the view and the depth of 0 would be farthest. Yet, if we flip the direction of the depth test (GL_LESS to GL_GREATER, etc), we get the exact same result. So it's really just a convention. Indeed, flipping the sign of Z and the depth test was once a vital performance optimization for many games.
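To make the quoted claim concrete, here is a minimal sketch in C of why reversing the depth range and flipping the test together pick the same winner. It is a software stand-in, not real OpenGL: map_depth and a_hides_b are hypothetical helpers, and the mapping is a simplification (the real glDepthRange maps normalized device Z from [-1, 1], not [0, 1]).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: maps a normalized depth d in [0, 1] (0 = nearest)
   linearly into the window-space range [z_near, z_far]. This only mimics
   the idea behind glDepthRange; it is not the exact OpenGL formula. */
static float map_depth(float d, float z_near, float z_far) {
    assert(d >= 0.0f && d <= 1.0f);
    return z_near + d * (z_far - z_near);
}

/* Returns true if fragment a hides fragment b under the chosen convention.
   reversed == false: range [0, 1] with a "less" test (like GL_LESS).
   reversed == true:  range [1, 0] with a "greater" test (like GL_GREATER). */
static bool a_hides_b(float da, float db, bool reversed) {
    if (reversed) {
        return map_depth(da, 1.0f, 0.0f) > map_depth(db, 1.0f, 0.0f);
    }
    return map_depth(da, 0.0f, 1.0f) < map_depth(db, 0.0f, 1.0f);
}
```

Under both conventions the nearer fragment wins, which is the sense in which the choice is "really just a convention".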

If I understand correctly, performance-wise, flipping the sign of Z and the depth test is nothing but changing a < comparison to a > comparison. So, if I understand correctly and the author isn't lying or making things up, then changing < to > used to be a vital optimization for many games.

Is the author making things up, am I misunderstanding something, or is it indeed the case that once < was slower (vitally, as the author says) than >?

Thanks for clarifying this quite curious matter!

Disclaimer: I am fully aware that algorithm complexity is the primary source for optimizations. Furthermore, I suspect that nowadays it definitely wouldn't make any difference and I am not asking this to optimize anything. I am just extremely, painfully, maybe prohibitively curious.

"If I understand correctly, performance-wise, flipping the sign of Z and the depth test is nothing but changing a < comparison to a > comparison. So, if I understand correctly and the author isn't lying or making things up, then changing < to > used to be a vital optimization for many games."

I didn't explain that particularly well, because it wasn't important. I just felt it was an interesting bit of trivia to add. I didn't intend to go over the algorithm specifically.

However, context is key. I never said that a < comparison was faster than a > comparison. Remember: we're talking about graphics hardware depth tests, not your CPU. Not operator<.

What I was referring to was a specific old optimization where one frame you would use GL_LESS with a range of [0, 0.5]. Next frame, you render with GL_GREATER with a range of [1.0, 0.5]. You go back and forth, literally "flipping the sign of Z and the depth test" every frame.

This loses one bit of depth precision, but you didn't have to clear the depth buffer, which once upon a time was a rather slow operation. Since depth clearing is not only free these days but actually faster than this technique, people don't do it anymore.
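The alternating scheme described above can be sketched as a small software simulation in C. This is only an illustration under stated assumptions, not real driver or game code: DepthBuffer, depth_buffer_init, next_frame and draw_fragment are hypothetical names, and the sketch assumes every pixel is covered by some fragment in every frame (otherwise a stale value from two frames earlier could survive).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PIXELS 4

/* Software stand-in for a hardware depth buffer. It is cleared exactly
   once, at start-up, and never again. */
typedef struct {
    float depth[NUM_PIXELS];
    unsigned frame; /* parity selects the depth convention for the frame */
} DepthBuffer;

static void depth_buffer_init(DepthBuffer *db) {
    for (size_t i = 0; i < NUM_PIXELS; ++i) {
        db->depth[i] = 1.0f; /* the one and only clear */
    }
    db->frame = 0;
}

/* Advances to the next frame WITHOUT clearing the buffer. */
static void next_frame(DepthBuffer *db) {
    db->frame++;
}

/* Depth-tests one fragment; d is a normalized depth in [0, 1], 0 = nearest.
   Returns true if the fragment passes, in which case its depth is written.
   Even frames: range [0, 0.5] with a "less" test (like GL_LESS).
   Odd frames:  range [1.0, 0.5] with a "greater" test (like GL_GREATER).
   The comparisons are strict, as in the text, so a fragment exactly on
   the far plane can tie at 0.5 with the previous frame's far plane. */
static bool draw_fragment(DepthBuffer *db, size_t pixel, float d) {
    assert(pixel < NUM_PIXELS && d >= 0.0f && d <= 1.0f);
    bool even = (db->frame % 2u) == 0u;
    float z = even ? 0.5f * d : 1.0f - 0.5f * d;
    bool pass = even ? (z < db->depth[pixel]) : (z > db->depth[pixel]);
    if (pass) {
        db->depth[pixel] = z;
    }
    return pass;
}
```

Because each frame writes into the half of the range that the previous frame did not use, every value left over from the previous frame automatically loses the current frame's test, so it behaves like a cleared buffer. Each frame only gets half of the [0, 1] range, which is where the lost bit of precision comes from.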

The answer is almost certainly that, for whatever combination of chip and driver was used, Hierarchical Z only worked in one direction; this was a fairly common issue back in the day. Low-level assembly and branching have nothing to do with it: Z-buffering is done in fixed-function hardware and is pipelined, so there is no speculation and hence no branch prediction.

It has to do with flag bits in highly tuned assembly.

x86 has both jl and jg instructions, but most RISC processors only have jl and jz (no jg).
