
3.9.1 Fundamental types

C++ Standard §3.9.1 Fundamental types

Objects declared as characters (char) shall be large enough to store any member of the implementation’s basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types.<...>

I could not make sense of unsigned char.

A number may be +1 or -1.

I cannot think of -A and +A in a similar manner.

What is the historical reason for introducing unsigned char?


A char is actually an integral type; it is just that the type is also used to represent a character. Since it is an integral type, it is valid to talk about its signedness.
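
For example, a char participates in ordinary integer arithmetic like any other integral type (a minimal sketch; the printed values assume an ASCII execution character set):

#include <iostream>

int main() {
    char c = 'A';
    int n = c + 1;  // char promotes to int in arithmetic, like any integral type
    std::cout << n << '\n';                    // 66 on ASCII systems
    std::cout << static_cast<char>(n) << '\n'; // B
}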

(I don't know the exact historical reason. Probably to save a keyword for byte by conflating it with char.)


In C (and thus C++), char does not mean character. It means a byte (akin to int_least8_t). This is a historical legacy from the pre-Unicode days when a character could actually fit in a char, but it is now a flaw in the language.

Since char is really a small integer, having signed char and unsigned char makes sense. There are actually three distinct char types: char, signed char, and unsigned char. A common convention is that unsigned char represents bytes while plain char represents characters (UTF-8 code units).
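
The distinctness is easy to verify at compile time; a minimal sketch (C++11 or later for static_assert):

#include <type_traits>

static_assert(!std::is_same<char, signed char>::value, "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is not unsigned char");
static_assert(!std::is_same<signed char, unsigned char>::value, "signed char is not unsigned char");

int main() {} // compiles: all three are distinct types per the standard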


Computers do not "understand" the concept of alphabets or characters; they only work on numbers. So a bunch of people got together and agreed on what number maps to what letter. The most common one in use is ASCII (although the language does not guarantee that).

In ASCII, the letter A has the code 65. In environments using ASCII, the letter A would be represented by the number 65.

The char datatype also serves as an integral type, meaning that it can hold plain numbers, so unsigned and signed variants were allowed. On most platforms I've seen, char is a single 8-bit byte.
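
A quick way to see the mapping (a small sketch; the value 65 assumes an ASCII execution character set, which the language does not guarantee):

#include <iostream>

int main() {
    std::cout << static_cast<int>('A') << '\n'; // 65 on ASCII platforms
    std::cout << ('A' == 65) << '\n';           // 1 (true) on ASCII platforms
}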


You're reading too much into it. A char is a small integral type that can hold a character. End of story. Unsigned char was never specially introduced or intended; it's just how it is, because char is an integral type just like int, long, or short, only the size differs. The fact is that there's little reason to use unsigned char, but people do when they want one-byte unsigned integral storage.


If you want a small memory footprint and want to store a number, then signed and unsigned char are useful.

unsigned char is needed if you want to store a value between 128 and 255:

unsigned char score = 232; // fits in 0-255; would overflow a signed 8-bit char

signed char is useful if you want to store the difference between two characters:

signed char diff = 'D' - 'A'; // 3 in ASCII; the difference can also be negative, e.g. 'A' - 'D'

char is distinct from the other two because you cannot assume it is either signed or unsigned.
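
You can ask your implementation which choice it made for plain char; a minimal sketch using std::numeric_limits:

#include <iostream>
#include <limits>

int main() {
    std::cout << std::boolalpha
              << std::numeric_limits<char>::is_signed << '\n';
    // implementation-defined: typically true on x86 gcc/clang, false on ARM Linux
}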


You can use the overflow (wrap-around) from 255 to 0? (I don't know; just a guess.)

Maybe it is not only about characters but also about numbers between -128 and 127, and between 0 and 255.
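
Unsigned wrap-around is indeed well-defined (modular arithmetic), unlike signed overflow; a minimal sketch:

#include <iostream>

int main() {
    unsigned char u = 255;
    ++u; // well-defined: wraps around to 0 (arithmetic modulo 256 when CHAR_BIT == 8)
    std::cout << static_cast<int>(u) << '\n'; // 0
}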


Think of the ASCII character set.

Historically, all characters used for text in computing were defined by the ASCII character set. Each character was stored in an 8-bit byte, which, treated as unsigned, could hold a value in the range 0 to 255 (ASCII itself only defines codes 0 to 127).

The word character was reduced to char for coding.

An 8-bit char used the same memory as an 8-bit byte, and as such they were interchangeable as far as the compiler was concerned.

The unsigned keyword (all numbers were signed by default, as two's complement is used to represent negative numbers in binary), when applied to a byte or a char, forced it to have a value in the range 0 to 255.

If signed, they had a value in the range -128 to +127.

Nowadays, with the advent of Unicode and multi-byte character sets, this relationship between byte and char no longer exists.
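
The exact ranges on a given platform can be printed with std::numeric_limits; a minimal sketch (the values shown in the comments assume CHAR_BIT == 8 and two's complement):

#include <iostream>
#include <limits>

int main() {
    std::cout << static_cast<int>(std::numeric_limits<signed char>::min()) << " to "
              << static_cast<int>(std::numeric_limits<signed char>::max()) << '\n';   // -128 to 127
    std::cout << static_cast<int>(std::numeric_limits<unsigned char>::min()) << " to "
              << static_cast<int>(std::numeric_limits<unsigned char>::max()) << '\n'; // 0 to 255
}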
