Tesla double precision

I am looking for information on how double precision is implemented in hardware on the Tesla GPU. I have read that two stream processors work together on a single double value, but I haven't found any official paper from NVIDIA.

Thanks in advance. PPS: Why do most GPUs compute only in single precision? Is it because colors can be stored as RR.GG.BB.TT, where each component is an 8-bit value?

PS: Googling it didn't help.


The lack of double support is not a matter of storage format as you suggest (RR.GG.BB.TT). It comes down to whether the chip has native instructions, and therefore dedicated hardware, for operations on doubles (add, mul, madd, etc.).
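To make the storage-versus-arithmetic point concrete, here is a minimal CPU-side sketch in Python. It does not model any GPU; it only emulates single-precision rounding with the standard `struct` module. The helper names `to_f32` and `f32_add` are made up for this illustration. It shows that 8-bit color channels fit comfortably in single precision, while an ordinary integer sum already exceeds what a 24-bit significand can hold.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_add(a: float, b: float) -> float:
    """Emulate a single-precision add: round the inputs, add, round the result."""
    return to_f32(to_f32(a) + to_f32(b))

# Every 8-bit channel value survives a round trip through a normalized float32,
# so color storage is not what limits a GPU to single precision.
channels_ok = all(round(to_f32(c / 255.0) * 255.0) == c for c in range(256))

# Single precision has a 24-bit significand, so 2**24 + 1 is not representable;
# double precision (53-bit significand) holds it exactly.
single_sum = f32_add(16777216.0, 1.0)
double_sum = 16777216.0 + 1.0

print(channels_ok)
print(single_sum)
print(double_sum)
```

The difference between `single_sum` and `double_sum` is the kind of error that native double-precision units exist to avoid.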

Most GPUs support only single precision because the bulk of the GPU market is gaming, and games don't need double precision. Most gamers are also looking for a good performance/price ratio. Implementing DP is costly in terms of transistor budget (and TDP), and if games don't use double precision, that cost buys nothing.

This is why high-end ATI GPUs support double (HD 59xx and HD 58xx), while mid-range and entry-level GPUs such as the HD 57xx and below do not.

@karlphillip: Yes, you're right, it is IEEE 754 (more or less) for GPUs like the GTX 260, but the current ATI and NVIDIA generations support IEEE 754-2008 on high-end parts.

As for the hardware implementation, these are secrets IHVs usually don't tell :)


Tesla is not a GPU; it's a line of coprocessors featuring various high-end GPUs. If your Tesla has a Fermi GPU inside, it should have good double precision performance.

See the Fermi white paper, page 9.

Single precision is more important for regular GPU computing because it is sufficient for such applications.


According to Wikipedia:

For double precision (only supported in newer GPUs like GTX 260[12]) there are some deviations from the IEEE 754 standard: round-to-nearest-even is the only supported rounding mode for reciprocal, division, and square root. In single precision, denormals and signalling NaNs are not supported; only two IEEE rounding modes are supported (chop and round-to-nearest even), and those are specified on a per-instruction basis rather than in a control word; and the precision of division/square root is slightly lower than single precision.

There you go, they implement most of the spec of IEEE 754, but the actual implementation is probably private and secret.
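The two single-precision rounding modes named in the quote, chop (round toward zero) and round-to-nearest-even, can be compared with a small CPU-side sketch. This is only an emulation in Python using the standard `struct` module, not how any GPU does it; the function names are hypothetical, and the input is assumed to be finite and within float32 range.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of x after rounding to float32 (round-to-nearest-even)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def round_nearest_even(x: float) -> float:
    """Round a double to float32 using round-to-nearest-even."""
    return bits_f32(f32_bits(x))

def round_chop(x: float) -> float:
    """Round a double to float32 toward zero (chop).

    Assumes x is finite and within float32 range.
    """
    bits = f32_bits(x)
    nearest = bits_f32(bits)
    # If nearest-even rounded away from zero, step back one representable
    # value: decrementing the bit pattern reduces the magnitude by one ulp.
    if abs(nearest) > abs(x):
        bits -= 1
    return bits_f32(bits)

# 2**24 + 3 lies exactly halfway between two float32 values.
tie = 16777219.0
print(round_nearest_even(tie), round_chop(tie))

# 0.1 is not representable; the two modes land on either side of it.
print(round_nearest_even(0.1), round_chop(0.1))
```

On a GPU the mode is selected per instruction, as the quote says, rather than through a global control word, so there is no state to switch as this emulation might suggest.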
