System where 1 byte != 8 bit? [duplicate]
All the time I read sentences like
don't rely on 1 byte being 8 bits in size
use CHAR_BIT instead of 8 as a constant to convert between bits and bytes
et cetera. What real-life systems are there today where this holds true? (I'm not sure if there are differences between C and C++ regarding this, or if it's actually language-agnostic. Please retag if necessary.)
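For concreteness, here is the kind of conversion I mean: a minimal sketch using CHAR_BIT from <climits> instead of a hard-coded 8 (the helper name bytes_for_bits is just something I made up for illustration).

    #include <climits>  // CHAR_BIT: number of bits in a char, i.e. in one byte
    #include <cstddef>

    // Hypothetical helper: how many bytes are needed to hold `bits` bits,
    // rounding up, without assuming a byte is 8 bits wide.
    constexpr std::size_t bytes_for_bits(std::size_t bits)
    {
        return (bits + CHAR_BIT - 1) / CHAR_BIT;
    }

    static_assert(bytes_for_bits(1) == 1, "any nonzero bit count needs at least one byte");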
On older machines, codes smaller than 8 bits were fairly common, but most of those have been dead and gone for years now.
C and C++ have mandated a minimum of 8 bits for char, at least as far back as the C89 standard. [Edit: For example, C90, §5.2.4.2.1 requires CHAR_BIT >= 8 and UCHAR_MAX >= 255. C89 uses a different section number (I believe that would be §2.2.4.2.1) but identical content.] They treat "char" and "byte" as essentially synonymous. [Edit: for example, CHAR_BIT is described as "number of bits for the smallest object that is not a bitfield (byte)".]
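As a quick illustration of those minimums (a sketch assuming C++11 for static_assert, not the standard's own wording), both guarantees can be checked at compile time, and a conforming implementation can never fail the checks:

    #include <climits>

    // The standard only fixes lower bounds; an implementation may exceed them.
    static_assert(CHAR_BIT >= 8,    "every conforming implementation has bytes of at least 8 bits");
    static_assert(UCHAR_MAX >= 255, "unsigned char can represent at least 0..255");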
There are, however, current machines (mostly DSPs) where the smallest type is larger than 8 bits -- a minimum of 12, 14, or even 16 bits is fairly common. Windows CE does roughly the same: its smallest type (at least with Microsoft's compiler) is 16 bits. They do not, however, treat a char as 16 bits -- instead they take the (non-conforming) approach of simply not supporting a type named char at all.
TODAY, in the world of C++ on x86 processors, it is pretty safe to rely on one byte being 8 bits. Processors where the word size is not a power of 2 (8, 16, 32, 64) are very uncommon.
IT WAS NOT ALWAYS SO.
The Control Data 6600 (and its brothers) Central Processor used a 60-bit word, and could only address a word at a time. In one sense, a "byte" on a CDC 6600 was 60 bits.
The DEC-10 byte pointer hardware worked with arbitrary-size bytes. The byte pointer included the byte size in bits. I don't remember whether bytes could span word boundaries; I think they couldn't, which meant that you'd have a few waste bits per word if the byte size was not 3, 4, 9, or 18 bits. (The DEC-10 used a 36-bit word.)
Unless you're writing code that could be useful on a DSP, you're completely entitled to assume bytes are 8 bits. All the world may not be a VAX (or an Intel), but all the world has to communicate, share data, establish common protocols, and so on. We live in the internet age built on protocols built on octets, and any C implementation where bytes are not octets is going to have a really hard time using those protocols.
It's also worth noting that both POSIX and Windows have (and mandate) 8-bit bytes. That covers 100% of interesting non-embedded machines, and these days a large portion of non-DSP embedded systems as well.
From Wikipedia:
The size of a byte was at first selected to be a multiple of existing teletypewriter codes, particularly the 6-bit codes used by the U.S. Army (Fieldata) and Navy. In 1963, to end the use of incompatible teleprinter codes by different branches of the U.S. government, ASCII, a 7-bit code, was adopted as a Federal Information Processing Standard, making 6-bit bytes commercially obsolete. In the early 1960s, AT&T introduced digital telephony first on long-distance trunk lines. These used the 8-bit µ-law encoding. This large investment promised to reduce transmission costs for 8-bit data. The use of 8-bit codes for digital telephony also caused 8-bit data "octets" to be adopted as the basic data unit of the early Internet.
As an average programmer on mainstream platforms, you do not need to worry too much about one byte not being 8 bits. However, I'd still use the CHAR_BIT constant in my code and assert (or better static_assert) any locations where you rely on 8-bit bytes. That should put you on the safe side.
(I am not aware of any relevant platform where it doesn't hold true).
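A minimal sketch of what such a guard might look like, assuming C++11 or later for static_assert (the message text is purely illustrative):

    #include <climits>

    // Fail the build, rather than misbehave at run time, on any platform
    // where the code's assumption of 8-bit bytes does not hold.
    static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

If the code is ever ported to, say, a 16-bit-char DSP, the build stops immediately instead of producing subtly wrong results.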
Firstly, the number of bits in char does not formally depend on the "system" or on the "machine", even though this dependency is usually implied by common sense. The number of bits in char depends only on the implementation (i.e. on the compiler). There's no problem implementing a compiler that will have more than 8 bits in char for any "ordinary" system or machine.

Secondly, there are several embedded platforms where sizeof(char) == sizeof(short) == sizeof(int), each having 16 bits (I don't remember the exact names of these platforms). Also, the well-known Cray machines had similar properties, with all these types having 32 bits in them.
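A small probe like the following (just a sketch, not tied to any particular DSP or Cray toolchain) shows how such an implementation reports itself; on a platform where char, short and int are all 16 bits wide, every sizeof below prints 1 and CHAR_BIT prints 16:

    #include <climits>
    #include <cstdio>

    int main()
    {
        // sizeof is measured in bytes (chars), not in groups of 8 bits,
        // so equal widths show up as sizeof == 1 for all three types.
        std::printf("CHAR_BIT      = %d\n", CHAR_BIT);
        std::printf("sizeof(char)  = %zu\n", sizeof(char));
        std::printf("sizeof(short) = %zu\n", sizeof(short));
        std::printf("sizeof(int)   = %zu\n", sizeof(int));
        return 0;
    }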
I do a lot of embedded work and am currently working on DSP code with CHAR_BIT of 16.
Historically, there existed a bunch of odd architectures that were not using native word sizes that were multiples of 8. If you ever come across any of these today, let me know.
- The first commercial CPU by Intel was the Intel 4004 (4-bit)
- PDP-8 (12-bit)
The size of the byte has historically been hardware dependent and no definitive standards exist that mandate the size.
It might just be a good thing to keep in mind if you're doing lots of embedded stuff.
Adding one more as a reference, from the Wikipedia entry on the HP Saturn:
The Saturn architecture is nibble-based; that is, the core unit of data is 4 bits, which can hold one binary-coded decimal (BCD) digit.