Does printf("%x",1) invoke undefined behavior?

2023-02-03 13:00

According to the C standard (6.5.2.2 paragraph 6)

If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:

one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;

both types are pointers to qualified or unqualified versions of a character type or void.

Thus, in general, there is nothing wrong with passing an int to a variadic function that expects an unsigned int (or vice versa) as long as the value passed fits in both types. However, the specification for printf reads (7.19.6.1 paragraph 9):

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

No exception is made for signed/unsigned mismatch.

Does this mean that printf("%x", 1) invokes undefined behavior?
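For comparison, the general variadic rule mentioned in the question can be sketched as follows. The function name `sum_as_unsigned` is hypothetical. The sketch relies on the `va_arg` specification (7.15.1.1 paragraph 2 in C99), which carries the same signed/unsigned exception for values representable in both types; it says nothing about what `printf` itself is allowed to do.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, reading each
   one as unsigned int. */
unsigned int sum_as_unsigned(int count, ...)
{
    va_list ap;
    unsigned int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* The caller may pass int here, provided the value is
           non-negative and therefore representable in both types. */
        total += va_arg(ap, unsigned int);
    }
    va_end(ap);
    return total;
}
```

A call such as `sum_as_unsigned(3, 1, 2, 3)` passes three `int` arguments and reads them back as `unsigned int`, which the `va_arg` exception permits for these values.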

I believe it is technically undefined, because the "correct type" for %x is specified as unsigned int - and as you point out, there is no exception for signed/unsigned mismatch here.

The rules for printf are for a more specific case and thus override the rules for the general case (for another example of the specific overriding the general, it's allowable in general to pass NULL to a function expecting a const char * argument, but it's undefined behaviour to pass NULL to strlen()).

I say "technically", because I believe an implementation would need to be intentionally perverse to cause a problem for this case, given the other restrictions in the standard.
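If you would rather not depend on that reading of the standard, one way to sidestep the question is to give `%x` an argument whose type really is `unsigned int`. A minimal sketch, with `format_hex` as a hypothetical helper name and `snprintf` used so the result can be inspected:

```c
#include <stdio.h>

/* Hypothetical helper: formats a non-negative int in hexadecimal.
   The cast makes the argument's type unsigned int, which is the type
   %x is specified to take. Returns what snprintf returns. */
int format_hex(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%x", (unsigned int)value);
}
```

For a literal, writing `1u` instead of `1` has the same effect without a cast.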

No, because %x formats an unsigned int, and the type of the constant expression 1 is int, while its value is representable as an unsigned int. The operation is not UB.

It is undefined behavior, for the same reason that reinterpreting a pointer to an integer type as a pointer to the corresponding type of opposite signedness is. This isn't allowed, unfortunately, in either direction, because a valid representation in one type may be a trap representation in the other.

The only reason I see that reinterpreting signed as unsigned could hit a trap representation is the perverse case of sign representation where the unsigned type just masks out the sign bit. Unfortunately, such a thing is allowed by 6.2.6.2 of the standard. On such an architecture, all negative values of the signed type may be trap representations of the unsigned type.

In your example this is even weirder, since 1 being a trap representation for the unsigned type is in turn not allowed. So to make it a "real" example, you'd have to ask your question with -1.

I don't think there is still any architecture with these features for which people write C compilers, so life would definitely become easier if a newer version of the standard abolished this nasty case.
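For the -1 case, an explicit conversion avoids any reinterpretation: converting -1 to `unsigned int` is defined by the standard's conversion rules to yield `UINT_MAX`, whatever the sign representation. A small sketch, assuming a hypothetical helper name, that compares the two formatted results rather than hard-coding a width-dependent string:

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: returns 1 if formatting (unsigned int)-1 with %x
   gives the same text as formatting UINT_MAX, 0 otherwise. */
int minus_one_prints_as_uint_max(void)
{
    char from_cast[32];
    char from_max[32];

    /* The cast converts the value; it does not reinterpret the bits. */
    snprintf(from_cast, sizeof from_cast, "%x", (unsigned int)-1);
    snprintf(from_max, sizeof from_max, "%x", UINT_MAX);
    return strcmp(from_cast, from_max) == 0;
}
```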

I believe it's undefined. Functions with a variable-length argument list don't apply an implicit conversion to the declared type when accepting arguments, so 1 won't be converted to unsigned int when being passed to printf(), causing undefined behavior.

TL;DR it is not UB.

As n. 'pronouns' m. pointed out in this answer, the C standard says that all non-negative values of a signed integer type have the exact same representation as in the corresponding unsigned type, and can therefore be used interchangeably as long as the value is in the range of both types.

From the C99 standard 6.2.5 Types - Paragraph 9 and Footnote 31:

9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. 31)

31) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

The exact same text is in the C11 Standard in 6.2.5 Types - Paragraph 9 and Footnote 41.
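The representation requirement quoted above can be observed directly by comparing the bytes of an `int` and an `unsigned int` holding the same non-negative value. This is only a sketch, and `same_representation` is a hypothetical name; it inspects the implementation it runs on and proves nothing about how `printf` must behave.

```c
#include <string.h>

/* Hypothetical check: returns 1 if `value` (assumed non-negative) is
   stored with the same bytes as an int and as an unsigned int. */
int same_representation(int value)
{
    unsigned int as_unsigned = (unsigned int)value;

    /* int and unsigned int occupy the same amount of storage, so
       sizeof value covers both objects. */
    return memcmp(&value, &as_unsigned, sizeof value) == 0;
}
```

On mainstream implementations, `same_representation(1)` should return 1.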

The authors of the Standard do not generally try to explicitly mandate behavior in every imaginable corner case, especially when there is an obvious correct behavior which is shared by 100% of all implementations, and there is no reason to expect any implementation to do anything else. Despite the Standard's explicit requirement that signed and unsigned types have matching memory representations for values that fit in both, it would be theoretically possible for an implementation to pass them to variadic functions differently. The Standard doesn't forbid such behavior, but I see no evidence of the authors intentionally permitting it. Most likely, they simply didn't consider such a possibility since no implementation had ever (and so far as I know, has ever) worked that way.

It would probably be reasonable for a sanitizing implementation to squawk if code uses %x on a signed value, though a quality sanitizing implementation should also provide an option to silently accept such code. There's no reason for sane implementations to do anything other than either process the passed value as unsigned or squawk if it's used in a diagnostic/sanitizing mode. While the Standard might not forbid an implementation from regarding as unreachable any code that uses %x on a signed value, anyone who thinks implementations should avail themselves of such freedom should be recognized as a moron.

Programmers who are targeting exclusively sane non-diagnostic implementations shouldn't need to worry about adding casts when outputting things like "uint8_t" values, but those whose code might be fed to moronic implementations might want to add such casts to guard against the "optimizations" such implementations might impose.
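The "uint8_t" situation arises because a `uint8_t` argument is promoted to `int` before it reaches a variadic function. A defensive version of a byte dump might look like the sketch below; `hex_dump` is a hypothetical helper, and the cast is the only part that matters here.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes `len` bytes as two-digit hex into `out`.
   `out` must have room for at least 2 * len + 1 characters. */
void hex_dump(char *out, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Without the cast, data[i] would arrive as int; with it,
           the argument has type unsigned int, as %x expects. */
        snprintf(out + 2 * i, 3, "%02x", (unsigned int)data[i]);
    }
    out[2 * len] = '\0';
}
```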

