Exotic architectures the standards committees care about
I know that the C and C++ standards leave many aspects of the language implementation-defined precisely because, if there were an architecture with different characteristics, a standard-conforming compiler for it would have to emulate those parts of the language, resulting in inefficient machine code.
Certainly, 40 years ago every computer had its own unique specification. However, I don't know of any architectures used today where:
- CHAR_BIT != 8
- signed is not two's complement (I heard Java had problems with this one)
- floating point is not IEEE 754 compliant (Edit: I meant "not in IEEE 754 binary encoding")
The reason I'm asking is that I often explain to people that it's good that C++ doesn't mandate other low-level aspects like fixed-size types†. It's good because, unlike in 'other languages', it makes your code portable when used correctly (Edit: because it can be ported to more architectures without requiring emulation of low-level aspects of the machine, e.g. two's complement arithmetic on a sign-and-magnitude architecture). But I feel bad that I cannot point to any specific architecture myself.
So the question is: what architectures exhibit the above properties?
† The uint*_t types are optional.
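For concreteness, here is a minimal probe of those three assumptions, assuming a hosted C++11 compiler; any of these static_asserts would fire on the kinds of architectures described in the answers below:

    // A compile-time probe of the three assumptions above (a sketch,
    // assuming a C++11 compiler; C++20 later made two's complement mandatory).
    #include <climits>
    #include <limits>

    static_assert(CHAR_BIT == 8, "char is not 8 bits");
    // In two's complement -1 is all ones, so its low two bits are 11;
    // ones' complement gives 10 and sign-magnitude gives 01.
    static_assert((-1 & 3) == 3, "signed is not two's complement");
    static_assert(std::numeric_limits<double>::is_iec559,
                  "double is not IEEE 754 (IEC 559)");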
Take a look at the Unisys ClearPath Dorado Servers, which offer backward compatibility for people who have not yet migrated all their Univac software.
Key points:
- 36-bit words
- CHAR_BIT == 9
- one's complement
- 72-bit non-IEEE floating point
- separate address space for code and data
- word-addressed
- no dedicated stack pointer
I don't know whether they offer a C++ compiler, but they could.
And now a link to a recent edition of their C manual has surfaced:
Unisys C Compiler Programming Reference Manual
Section 4.5 has a table of data types with 9, 18, 36, and 72 bits.
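As a hedged illustration of what portable code looks like on such a machine: serializing a 32-bit value as octets must mask and shift by 8 explicitly rather than relying on unsigned char being an octet (a sketch, not tied to any particular Unisys toolchain):

    #include <cstdint>

    // Write a 32-bit value as four octets without assuming CHAR_BIT == 8.
    // With CHAR_BIT == 9 each unsigned char holds an octet plus a spare bit;
    // masking with 0xFF keeps the on-the-wire format portable.
    void put_octets(std::uint_least32_t v, unsigned char out[4]) {
        for (int i = 3; i >= 0; --i) {
            out[i] = static_cast<unsigned char>(v & 0xFF); // low octet only
            v >>= 8;                                       // 8, not CHAR_BIT
        }
    }

(Note the use of uint_least32_t: per the question's footnote, the exact-width uint32_t need not exist on such a platform.)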
None of your assumptions hold for mainframes. For starters, I don't know of a mainframe which uses IEEE 754: IBM uses base-16 floating point, and both of the Unisys mainframes use base 8. The Unisys machines are a bit special in many other respects: Bo has mentioned the 2200 architecture, but the MCP architecture is even stranger: 48-bit tagged words. (Whether the word is a pointer or not depends on a bit in the word.)

And the numeric representations are designed so that there is no real distinction between floating-point and integral arithmetic: the floating point is base 8; it doesn't require normalization, and unlike every other floating point I've seen, it puts the decimal point to the right of the mantissa, rather than the left, and uses signed magnitude for the exponent (in addition to the mantissa). The result is that an integral floating-point value has (or can have) exactly the same bit representation as a signed-magnitude integer. And there are no floating-point arithmetic instructions: if the exponents of the two values are both 0, the instruction does integral arithmetic; otherwise, it does floating-point arithmetic. (A continuation of the tagging philosophy in the architecture.) This means that while int may occupy 48 bits, 8 of them must be 0, or the value won't be treated as an integer.
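To make that concrete, here is a toy decoder for such a 48-bit word, with an assumed field split (1 mantissa-sign bit, 8 exponent bits in signed magnitude, 39 mantissa bits); the real Unisys layout may well differ, but it shows why a word whose 8 exponent bits are zero reads directly as a signed-magnitude integer:

    #include <cmath>
    #include <cstdint>

    // Toy model only: the field widths are an assumption for illustration.
    // value = (-1)^sign * mantissa * 8^exponent, with the point to the
    // right of the mantissa and a signed-magnitude exponent.
    double value(std::uint64_t w) {               // w uses the low 48 bits
        int    sign    = int((w >> 47) & 1);
        int    expSign = int((w >> 46) & 1);
        int    expMag  = int((w >> 39) & 0x7F);
        double mant    = double(w & ((std::uint64_t(1) << 39) - 1));
        double exp     = expSign ? -expMag : expMag;
        return (sign ? -1.0 : 1.0) * mant * std::pow(8.0, exp);
    }
    // With the 8 exponent bits all zero, value(w) is exactly the
    // signed-magnitude integer held in the sign and mantissa bits.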
Full IEEE 754 compliance is rare among floating-point implementations, and weakening the specification in that regard allows many optimizations.

For example, subnormal support differs between x87 and SSE.

Optimizations like fusing a multiplication and an addition that were separate in the source code change the results slightly too, but that fusion is a nice optimization on some architectures.
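Here is what that looks like in practice, assuming an IEEE 754 binary64 double; std::fma rounds once where the separate multiply and add round twice (and a compiler allowed to contract the unfused expression may make the two agree, which is exactly the point):

    #include <cmath>
    #include <cstdio>

    int main() {
        double e = std::ldexp(1.0, -27);     // 2^-27
        double a = 1.0 + e, b = 1.0 - e, c = -1.0;
        double fused   = std::fma(a, b, c);  // a*b + c rounded once: -2^-54
        double unfused = a * b + c;          // a*b first rounds to 1.0, so 0.0
        std::printf("fused = %g, unfused = %g\n", fused, unfused);
    }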
Or, on x86, strict IEEE compliance might require setting certain flags, or additional transfers between the floating-point registers and normal memory, to force use of the specified floating-point type instead of the internal 80-bit floats.

And some platforms have no hardware floats at all and thus need to emulate them in software. Some of the requirements of IEEE 754 can be expensive to implement in software; in particular, the rounding rules can be a problem.
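For instance, directed rounding alone changes results, and a software float library has to honor all of the modes. A sketch (it relies on the compiler honoring the runtime rounding mode, e.g. via -frounding-math):

    #include <cfenv>
    #include <cstdio>

    int main() {
        volatile double big = 1e30, tiny = 1.0; // volatile keeps the adds at run time
        std::fesetround(FE_DOWNWARD);
        double down = big + tiny;               // rounds toward minus infinity
        std::fesetround(FE_UPWARD);
        double up   = big + tiny;               // rounds toward plus infinity
        std::fesetround(FE_TONEAREST);
        std::printf("down = %.17g\nup   = %.17g\n", down, up); // differ by one ULP
    }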
My conclusion is that you don't need exotic architectures to get into situations where you don't always want to guarantee strict IEEE compliance, which is why few programming languages do.
I found this link listing some systems where CHAR_BIT != 8. They include:
- some TI DSPs, which have CHAR_BIT == 16
- the BlueCore-5 chip (a Bluetooth chip from Cambridge Silicon Radio), which has CHAR_BIT == 16
And of course there is a question on Stack Overflow: What platforms have something other than 8-bit char?

As for non-two's-complement systems, there is an interesting read on comp.lang.c++.moderated. Summarized: there are platforms with ones' complement or sign-and-magnitude representation.
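The thread's point can be checked directly: the low bits of -1 distinguish the three signed representations that C++ (before C++20) allowed. A sketch; on any machine you are likely to run it, it prints "two's complement":

    #include <cstdio>

    int main() {
        switch (-1 & 3) {          // ...111 vs ...110 vs ...001
            case 3: std::puts("two's complement");   break;
            case 2: std::puts("ones' complement");   break;
            case 1: std::puts("sign and magnitude"); break;
        }
    }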
I'm fairly sure that VAX systems are still in use. They don't support IEEE floating-point; they use their own formats. Alpha supports both VAX and IEEE floating-point formats.
Cray vector machines, like the T90, also have their own floating-point format, though newer Cray systems use IEEE. (The T90 I used was decommissioned some years ago; I don't know whether any are still in active use.)
The T90 also had/has some interesting representations for pointers and integers. A native address can only point to a 64-bit word. The C and C++ compilers had CHAR_BIT == 8 (necessary because it ran Unicos, a flavor of Unix, and had to interoperate with other systems). All byte-level operations were synthesized by the compiler, and a void* or char* stored a byte offset in the high-order 3 bits of the word. And I think some integer types had padding bits.
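A toy model of such a char* (the 3-bit byte offset in the top of a 64-bit word is from the description above; the exact packing of the remaining bits is an assumption for illustration):

    #include <cstdint>

    struct ByteAddr { std::uint64_t word; unsigned offset; };  // offset 0..7

    // Pack a word address plus byte offset the way the T90's compiler
    // might have: offset in the high-order 3 bits (assumed layout).
    std::uint64_t pack(ByteAddr a) {
        return (std::uint64_t(a.offset) << 61)
             | (a.word & ((std::uint64_t(1) << 61) - 1));
    }

    ByteAddr unpack(std::uint64_t p) {
        return { p & ((std::uint64_t(1) << 61) - 1), unsigned(p >> 61) };
    }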
IBM mainframes are another example.
On the other hand, these particular systems needn't necessarily preclude changes to the language standard. Cray didn't show any particular interest in upgrading its C compiler to C99; presumably the same thing applied to the C++ compiler. It might be reasonable to tighten the requirements for hosted implementations, such as requiring CHAR_BIT==8, IEEE format floating-point if not the full semantics, and 2's-complement without padding bits for signed integers. Old systems could continue to support earlier language standards (C90 didn't die when C99 came out), and the requirements could be looser for freestanding implementations (embedded systems) such as DSPs.
On the other other hand, there might be good reasons for future systems to do things that would be considered exotic today.
CHAR_BIT
According to the gcc source code:
- CHAR_BIT is 16 bits for the 1750a and dsp16xx architectures.
- CHAR_BIT is 24 bits for the dsp56k architecture.
- CHAR_BIT is 32 bits for the c4x architecture.
You can easily find more by doing:
find $GCC_SOURCE_TREE -type f | xargs grep "#define CHAR_TYPE_SIZE"
or
find $GCC_SOURCE_TREE -type f | xargs grep "#define BITS_PER_UNIT"
if CHAR_TYPE_SIZE is appropriately defined.
IEEE 754 compliance
If the target architecture doesn't support floating-point instructions, gcc may generate a software fallback which is not standard-compliant by default. Moreover, special options (like -funsafe-math-optimizations, which also disables sign preservation for zeros) can be used.
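The signed-zero point is easy to demonstrate: with -fno-signed-zeros (implied by -funsafe-math-optimizations) gcc may fold x + 0.0 into x, which is wrong for x == -0.0 under IEEE 754, where the correctly rounded sum is +0.0. A sketch; compile once with and once without the flag:

    #include <cmath>
    #include <cstdio>

    int main() {
        volatile double x = -0.0;
        double y = x + 0.0;   // IEEE 754: +0.0; unsafe math may leave -0.0
        std::printf("signbit(y) = %d\n", std::signbit(y) ? 1 : 0);
    }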
IEEE 754 binary representation was uncommon on GPUs until recently, see GPU Floating-Point Paranoia.
EDIT: a question was raised in the comments whether GPU floating point is relevant to ordinary computer programming, unrelated to graphics. Hell, yes! Most high-performance industrial computation today is done on GPUs; the list includes AI, data mining, neural networks, physical simulations, weather forecasting, and much, much more. One of the links in the comments shows why: GPUs have an order-of-magnitude floating-point advantage.
Another thing I'd like to add, which is more relevant to the OP's question: what did people do 10-15 years ago, when GPU floating point was not IEEE and there was no API like today's OpenCL or CUDA to program GPUs? Believe it or not, early GPU computing pioneers managed to program GPUs without an API for that! I met one of them at my company. Here's what he did: he encoded the data he needed to compute as an image, with pixels representing the values he was working on, then used OpenGL to perform the operations he needed (such as a "gaussian blur" to represent a convolution with a normal distribution, etc.), and decoded the resulting image back into an array of results. And this was still faster than using the CPU!
Things like that are what prompted NVidia to finally make their internal data binary-compatible with IEEE and to introduce an API oriented toward computation rather than image manipulation.