Multiple characters in a character constant

Some C compilers permit multiple characters in a character constant. This means that writing 'yes' instead of "yes" may well go undetected. Source: C traps and pitfalls

Can anyone give an example of this, where multiple characters are allowed in a character constant?


As Code Monkey noted, the value is implementation-defined and implementations do vary; it isn't just a matter of big-endian versus little-endian or of character set. I tested four implementations (all using ASCII) with this program:

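#include <stdio.h>

int main(void)
{
    unsigned value = 'ABCD';
    /* View the bytes of value in memory order (assumes a 4-byte unsigned) */
    unsigned char* ptr = (unsigned char*)&value;

    /* First number: bytes as stored in memory; second: the integer value */
    printf("'ABCD' = %02x%02x%02x%02x = %08x\n", ptr[0], ptr[1], ptr[2], ptr[3], value);
    value = 'ABC';
    printf("'ABC'  = %02x%02x%02x%02x = %08x\n", ptr[0], ptr[1], ptr[2], ptr[3], value);
    return 0;
}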

and I got four different results (the first number is the byte layout in memory, the second is the integer value):

Big endian (AIX, POWER, IBM compiler)

'ABCD' = 41424344 = 41424344
'ABC'  = 00414243 = 00414243

Big endian (Solaris, Sparc, SUN compiler)

'ABCD' = 44434241 = 44434241
'ABC'  = 00434241 = 00434241

Little endian (Linux, x86_64, gcc)

'ABCD' = 44434241 = 41424344
'ABC'  = 43424100 = 00414243

Little endian (Solaris, x86_64, Sun compiler)

'ABCD' = 41424344 = 44434241
'ABC'  = 41424300 = 00434241
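
The integer values in the right-hand column fall into two groups: the IBM compiler and gcc give 41424344 for 'ABCD', while both Sun compilers give 44434241. A minimal sketch that reproduces those two sets of numbers is below. The function names are made up for illustration, and this is only a model that matches the output above (assuming ASCII and 8-bit chars), not a description of how any of these compilers is actually implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: folds the characters left to right, shifting the
   running value up by 8 bits before OR-ing in the next character.
   Matches the values printed by the IBM compiler and gcc above. */
uint32_t fold_chars(const char *s)
{
    uint32_t value = 0;
    size_t n = strlen(s);
    assert(n >= 1 && n <= 4);   /* at most 4 characters fit in 32 bits */
    for (size_t i = 0; i < n; i++) {
        value = (value << 8) | (unsigned char)s[i];
    }
    return value;
}

/* Hypothetical helper: the same fold, but starting from the last
   character. Matches the values printed by the two Sun compilers above. */
uint32_t fold_chars_reversed(const char *s)
{
    uint32_t value = 0;
    size_t n = strlen(s);
    assert(n >= 1 && n <= 4);
    for (size_t i = n; i > 0; i--) {
        value = (value << 8) | (unsigned char)s[i - 1];
    }
    return value;
}
```

With ASCII, fold_chars("ABC") gives 0x00414243 and fold_chars_reversed("ABC") gives 0x00434241, the two values seen for 'ABC'. The left-hand column then depends on how the platform stores that integer in memory, which is why four combinations show up.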


You could use it in a case statement, I guess, but I wouldn't recommend it.

'yes' is a multicharacter constant. Its type is int, and its value is implementation-dependent, so, as you already stated, it's up to the compiler.

So int foo = 'yes'; compiles, but the value stored in foo depends on the compiler.
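
To make the 'yes' versus "yes" difference concrete, here is a small sketch in C. The function names are invented for illustration. It relies only on the fact that character constants (single or multicharacter) have type int in C, while "yes" is an array of four chars. Expect a warning on the 'yes' line from compilers such as gcc, which is exactly the diagnostic that catches the typo.

```c
#include <assert.h>
#include <stddef.h>

/* In C, a single-character constant has type int, not char. */
size_t char_constant_size(void)
{
    return sizeof 'y';
}

/* A multicharacter constant also has type int; only its value is
   implementation-defined. Many compilers warn about this line. */
size_t multichar_constant_size(void)
{
    return sizeof 'yes';
}

/* The string literal "yes" is an array: 'y', 'e', 's' and '\0'. */
size_t string_literal_size(void)
{
    return sizeof "yes";
}

/* Sanity checks for the three facts above. */
void check_sizes(void)
{
    assert(char_constant_size() == sizeof(int));
    assert(multichar_constant_size() == sizeof(int));
    assert(string_literal_size() == 4);
}
```

Because 'yes' is just an int, passing it where an int is accepted (a variadic argument, for instance) can slip through without an error, which is the pitfall the book describes.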

The Annotated C++ Reference Manual (ARM), section 2.5.2, page 9:

"A character constant is one or more characters enclosed in single quotes, as in 'x'."

Later on the same page:

"Multicharacter constants have type int. The value of a multicharacter constant is implementation dependent. For example, the value of 'AB' could reasonably be expected to be 'A' 'B' and ('A'<<8)+'B' on three different implementations. Multicharacter constants are usually best avoided."

and

Quoting from the ANSI C specification (to which C++ makes some attempt to be compatible):

3.1.3.4 Character Constants Semantics

An integer character constant has type int [note that it has type char in C++]... The value of an integer character constant containing more than one character... is implementation-defined.


Multi-character constants are allowed in all contexts where single-character constants are allowed.

As for where they'd actually be used, I've seen code that uses multi-character constants to create legible unique values. For example, assuming that int is 4 bytes, 'ABCD' and 'EFGH' are likely to be distinct. (This isn't guaranteed by the language; the implementation must document the mapping, but it needn't be reasonable.) And assuming a reasonable mapping, you'll likely see "ABCD" or "EFGH" in the object code. Not the best idea in the world, but it can work if you don't care much about portability.
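
If you want the "legible unique value" idea without depending on the compiler's mapping, one option is to build the tag yourself with explicit shifts. The sketch below is an assumption-laden illustration, not anything from the code I've seen: MAKE_TAG, TAG_ABCD, TAG_EFGH, tag_name and tag_to_string are all hypothetical names, and the byte order (first character in the high byte) is simply a choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: packs four characters into a 32-bit tag with an
   explicit byte order, first character in the most significant byte. */
#define MAKE_TAG(a, b, c, d)                      \
    (((uint32_t)(unsigned char)(a) << 24) |       \
     ((uint32_t)(unsigned char)(b) << 16) |       \
     ((uint32_t)(unsigned char)(c) << 8)  |       \
      (uint32_t)(unsigned char)(d))

#define TAG_ABCD MAKE_TAG('A', 'B', 'C', 'D')
#define TAG_EFGH MAKE_TAG('E', 'F', 'G', 'H')

/* The tags are integer constant expressions, so they work as case labels. */
const char *tag_name(uint32_t tag)
{
    switch (tag) {
    case TAG_ABCD: return "ABCD";
    case TAG_EFGH: return "EFGH";
    default:       return "unknown";
    }
}

/* Unpacks a tag into its four characters plus a terminating '\0'. */
void tag_to_string(uint32_t tag, char out[5])
{
    assert(out != NULL);
    out[0] = (char)((tag >> 24) & 0xFF);
    out[1] = (char)((tag >> 16) & 0xFF);
    out[2] = (char)((tag >> 8) & 0xFF);
    out[3] = (char)(tag & 0xFF);
    out[4] = '\0';
}
```

The tags are still readable in the source and usable in a switch, but their numeric values are the same on every compiler that shares the character set. What you lose is the guarantee that the text "ABCD" appears in that order in memory, since that still depends on endianness.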

Incidentally, all conforming C compilers support multi-character constants (by definition; a compiler that doesn't support them is non-conforming).
