Does printf("%x",1) invoke undefined behavior?
According to the C standard (6.5.2.2 paragraph 6)
If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion ar开发者_如何学运维e not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:
- one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
- both types are pointers to qualified or unqualified versions of a character type or void.
Thus, in general, there is nothing wrong with passing an int
to a variadic function that expects an unsigned int
(or vice versa) as long as the value passed fits in both types. However, the specification for printf
reads (7.19.6.1 paragraph 9):
If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
No exception is made for signed/unsigned mismatch.
Does this mean that printf("%x", 1)
invokes undefined behavior?
I believe it is technically undefined, because the "correct type" for %x
is specified as unsigned int
- and as you point out, there is no exception for signed/unsigned mismatch here.
The rules for printf
are for a more specific case and thus override the rules for the general case (for another example of the specific overriding the general, it's allowable in general to pass NULL
to a function expecting a const char *
argument, but it's undefined behaviour to pass NULL
to strlen()
).
I say "technically", because I believe an implementation would need to be intentionally perverse to cause a problem for this case, given the other restrictions in the standard.
No, because %x formats an unsigned int, and the type of the constant expression 1 is int, while the value of it is expressible as an unsigned int. The operation is not UB.
It is undefined behavior, for the same reason that re-interpreting a pointer to an integer type to complementary type of opposite signedness. This isn't allowed, unfortunately, in both directions because a valid representation in one may be a trap implementation in the other.
The only reason I see that from signed to unsigned re-interpretation there may be a trap representation is this perverted case of sign representation where the unsigned type just masks out the sign bit. Unfortunately such a thing is allowed as of 6.2.6.2 of the standard. On such an architecture all negative values of the signed type may be trap representations of the unsigned type.
In your example case this is even more weird, since having 1
a trap representation for the unsigned type is in turn not allowed. So to make it a "real" example, you'd have to ask your question with a -1
.
I don't think that there is still any architecture for which people write C compilers that has these features, so definitively live would become more easy if a newer version of the standard could abolish this nasty case.
I believe it's undefined. Functions with a variable-length arguments list don't have an implicit conversion when accepting arguments, so 1
won't be cast to unsigned int
when being past to printf()
, causing undefined behavior.
TL;DR it is not UB.
As n. 'pronouns' m. pointed out in this answer, the C standard says that all non-negative values of a signed integer type have the exact same representation as the corresponding unsigned type, and therefore can be used interchangeable as long as the value is in the range of both types.
From the C99 standard 6.2.5 Types - Paragraph 9 and Footnote 31:
9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. 31)
31) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
The exact same text is in the C11 Standard in 6.2.5 Types - Paragraph 9 and Footnote 41.
The authors of the Standard do not generally try to explicitly mandate behavior in every imaginable corner case, especially when there is an obvious correct behavior which is shared by 100% of all implementations, and there no reason to expect any implementation to do anything else. Despite the Standard's explicit requirement that signed and unsigned types have matching memory representations for values that fit in both, it would be theoretically possible for an implementation to pass them to variadic functions differently. The Standard doesn't forbid such behavior, but I see no evidence of the authors intentionally permitting it. Most likely, they simply didn't consider such a possibility since no implementation had ever (and so far as I know, has ever) worked that way.
It would probably be reasonable for a sanitizing implementation to squawk if code uses %x on a signed value, though a quality sanitizing implementation should also provide an option to silently accept such code. There's no reason for sane implementations to do anything other than either process the passed value as unsigned or squawk if it's used in a diagnostic/sanitizing mode. While the Standard might forbid an implementation from regarding as unreachable any code that uses %x on a signed value, anyone who thinks implementations should avail themselves of such freedom should be recognized as a moron.
Programmers who are targeting exclusively sane non-diagnostic implementations shouldn't need to worry about adding casts when outputting things like "uint8_t" values, but those whose code might be fed to moronic implementations might want to add such casts to prevent compilers from the "optimizations" such implementations might impose.
精彩评论