Should I expect differences in the output of a CUDA application between generations?

Posted 2023-03-11 01:47

I have some code that is compiled and tested on both Tesla and Fermi generation chipsets.

Across all Tesla generation chips (260,280,c1060) the output is consistent.

Across all Fermi generation chips (460-580, c2080) the output is consistent.

However, between the Tesla and Fermi generations the output images are subtly different.

Is this to be expected? There is floating point math in the code, and precision is my first suspicion, but I can't find any mention of it in Nvidia's docs.

You should also check out my whitepaper and webinar on floating point for NVIDIA GPUs (I'm an NVIDIA employee).

http://developer.nvidia.com/content/everything-you-ever-wanted-know-about-floating-point-were-afraid-ask

To answer the question, there are indeed numeric differences between the hardware generations. The "compute capability" tells you what features the chip has. Devices of compute capability 1.0-1.2 just have single precision. Single precision on these devices is flush-to-zero, meaning it doesn't support denormal numbers. Some operations like division and square root are not correctly rounded (they use fast hardware approximations to the functions).
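To make the flush-to-zero point concrete, here is a minimal host-side C++ sketch (not CUDA device code). It assumes the host follows default IEEE 754 gradual underflow, and `flush_to_zero` is a hypothetical helper that only roughly models what compute capability 1.0-1.2 hardware does to single precision results.

```cpp
#include <cfloat>
#include <cmath>

// Hypothetical helper: emulate flush-to-zero by replacing any
// subnormal (denormal) value with a zero of the same sign.
float flush_to_zero(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}

// Halving the smallest normal float gives a subnormal result
// when gradual underflow is supported.
float half_of_min_ieee() {
    volatile float smallest_normal = FLT_MIN;  // volatile: force a runtime multiply
    return smallest_normal * 0.5f;
}

// The same computation with the result flushed, as on a device
// without denormal support.
float half_of_min_ftz() {
    return flush_to_zero(half_of_min_ieee());
}
```

Under these assumptions `half_of_min_ieee()` returns a small positive subnormal while `half_of_min_ftz()` returns exactly zero, so any later computation that depends on that value can diverge between the two behaviours.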

Devices of compute capability 1.3 added support for double precision. Double precision is correctly rounded and supports denormals. Double precision also has a fused multiply-add, which increases precision.

Devices of compute capability 2.0 and later upgraded the single precision floating point. Now single precision is correctly rounded and supports denormals. They also have a fused multiply-add in single precision as well as in double precision.
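The effect of a fused multiply-add on results can be sketched on the host in C++ (again, not device code). Here `std::fma` stands in for the GPU's fused instruction, and the function names and input values are illustrative assumptions chosen so that the intermediate rounding is visible.

```cpp
#include <cmath>

// Multiply, round to float, then add: two roundings.
// The volatile temporary keeps the compiler from contracting
// the expression into a fused multiply-add on its own.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c with a single rounding at the end.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Illustrative inputs: kA * kA = 1 + 2^-11 + 2^-24 exactly, and the
// 2^-24 term is lost when the product is rounded to float.
const float kA = 1.0f + std::ldexp(1.0f, -12);
const float kC = -(1.0f + std::ldexp(1.0f, -11));
```

With these inputs `mul_then_add(kA, kA, kC)` returns 0, while `fused_mul_add(kA, kA, kC)` returns 2^-24 (about 5.96e-8). Both are valid results for their instruction sequence, which is why the same source can produce slightly different output depending on whether the hardware and compiler fuse the operations.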

In the Fermi Tuning Guide there is a section about IEEE 754-2008 Compliance which states:

Devices of compute capability 2.x have far fewer deviations from the IEEE 754-2008 floating point standard than devices of compute capability 1.x, particularly in single precision (Section F.2). This can cause slight changes in numeric results between devices of compute capability 1.x and devices of compute capability 2.x.

The full document is available in the downloads section of the CUDA website.
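Since slight numeric changes between generations are expected, comparing output images across devices is usually done with a tolerance rather than bit for bit. The following C++ sketch shows one way to do that; `images_match` and its tolerance parameters are hypothetical, and suitable tolerance values depend on the application.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every pair of pixels agrees within
// abs_tol + rel_tol * max(|a|, |b|). Buffers of different
// sizes never match.
bool images_match(const std::vector<float>& a, const std::vector<float>& b,
                  float rel_tol, float abs_tol) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float diff = std::fabs(a[i] - b[i]);
        float scale = std::max(std::fabs(a[i]), std::fabs(b[i]));
        // Written as a negated <= so that a NaN pixel counts as a mismatch.
        if (!(diff <= abs_tol + rel_tol * scale)) return false;
    }
    return true;
}
```

For example, calling `images_match(tesla_pixels, fermi_pixels, 1e-5f, 1e-6f)` on two host-side copies of the output would accept differences of a few units in the last place while still flagging real regressions.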
