When printing the address of a variable with printf, why cast to void*?
I saw some usage of (void*) in printf().
If I want to print a variable's address, can I do it like this:
int a = 19;
printf("%d", &a);
- I think &a is a's address, which is just an integer, right? Many articles I read use something like this:
printf("%p", (void*)&a);
- What does %p stand for? (A pointer?)
- Why use (void*)? Can't I use (int)&a instead?
Pointers are not numbers. They are often internally represented that way, but they are conceptually distinct.
void* is designed to be a generic pointer type. Any pointer value (other than a function pointer) may be converted to void* and back again without loss of information. This typically means that void* is at least as big as other pointer types.
printf's "%p" format requires an argument of type void*. That's why an int* should be cast to void* in that context. (There's no implicit conversion because printf is a variadic function; there's no declared parameter, so the compiler doesn't know what to convert it to.)
Sloppy practices like printing pointers with "%d", or passing an int* to printf with a "%p" format, are things that you can probably get away with on most current systems, but they render your code non-portable. (Note that it's common on 64-bit systems for void* and int to be different sizes, so printing pointers with "%d" is really non-portable, not just theoretically.)
Incidentally, the output format for "%p" is implementation-defined. Hexadecimal is common (in upper or lower case, with or without a leading "0x" or "0X"), but it's not the only possibility. All you can count on is that, assuming a reasonable implementation, it will be a reasonable way to represent a pointer value in human-readable form (and that scanf will understand the output of printf).
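The point about scanf understanding the output of printf can be checked directly. The following is a minimal sketch, not a definitive implementation: the helper name pointer_text_round_trips and the 64-byte buffer size are assumptions made for illustration. It formats a pointer with "%p", parses the text back with "%p", and compares the result with the original. Nothing in it depends on what the text actually looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: formats ptr with "%p", parses the text back with
   "%p", and reports whether the recovered pointer equals the original. */
bool pointer_text_round_trips(void *ptr)
{
    char text[64];        /* assumed large enough for any "%p" output */
    void *parsed = NULL;

    /* "%p" expects a void*; ptr already has that type here. */
    int written = snprintf(text, sizeof text, "%p", ptr);
    assert(written > 0 && (size_t)written < sizeof text);

    /* For scanf-family functions, "%p" expects a pointer to void*. */
    if (sscanf(text, "%p", &parsed) != 1)
        return false;

    return parsed == ptr;
}
```

A caller would pass the address with the cast discussed above, for example pointer_text_round_trips((void*)&a).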
The article you read is entirely correct. The correct way to print an int* value is
printf("%p", (void*)&a);
Don't take the lazy way out; it's not at all difficult to get it right.
Suggested reading: Section 4 of the comp.lang.c FAQ. (Further suggested reading: All the other sections.)
EDIT:
In response to Alcott's question:
There is still one thing I don't quite understand. Given int a = 10; int *p = &a; p's value is a's address in memory, right? If so, then p's value will range from 0 to 2^32-1 (if the CPU is 32-bit), and an integer is 4 bytes on a 32-bit OS, right? Then what's the difference between p's value and an integer? Can p's value go out of that range?
The difference is that they're of different types.
Assume a system on which int, int*, void*, and float are all 32 bits (this is typical for current 32-bit systems). Does the fact that float is 32 bits imply that its range is 0 to 2^32-1? Or -2^31 to 2^31-1? Certainly not; the range of float (assuming IEEE representation) is approximately -3.40282e+38 to +3.40282e+38, with widely varying resolution across the range, plus exotic values like negative zero, subnormalized numbers, denormalized numbers, infinities, and NaNs (Not-a-Number). int and float are both 32 bits, and you can take the 32 bits of a float object and treat it as an int representation, but the result won't have any straightforward relationship to the value of the float. The second low-order bit of an int, for example, has a specific meaning; it contributes 0 to the value if it's 0, and 2 to the value if it's 1; the corresponding bit of a float has a meaning, but it's quite different (it contributes a value that depends on the value of the exponent).
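To make the float comparison concrete, here is a small sketch that copies the bits of a float into a 32-bit unsigned integer. The helper name float_bits is an assumption for illustration, and the code assumes that float is 32 bits wide, which the language does not guarantee (the static_assert rejects platforms where it isn't).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float
   reinterpreted as a 32-bit unsigned integer. */
uint32_t float_bits(float value)
{
    static_assert(sizeof(float) == sizeof(uint32_t),
                  "this sketch assumes a 32-bit float");

    uint32_t bits = 0;
    /* memcpy copies the bytes without converting the value. */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On an implementation that uses the IEEE 754 single-precision format, float_bits(1.0f) is 0x3F800000, which read as an integer is 1065353216. The same 32 bits mean something entirely different depending on the type used to interpret them.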
The situation with pointers is quite similar. A pointer value has a meaning: it's the address of some object (or any of several other things, but we'll set that aside for now). On most current systems, interpreting the bits of a pointer object as if it were an integer gives you something that makes sense on the machine level. But the language itself does not guarantee, or even hint, that that's the case.
Pointers are not numbers.
A concrete example: some years ago, I ran across some code that tried to compute the difference in bytes between two addresses by casting to integers. It was something like this:
unsigned char *p0;
unsigned char *p1;
long difference = (unsigned long)p1 - (unsigned long)p0;
If you assume that pointers are just numbers, representing addresses in a linear monolithic address space, then this code makes sense. But that assumption is not supported by the language. And in fact, there was a system on which that code was intended to run (the Cray T90) on which it simply would not have worked. The T90 had 64-bit pointers pointing to 64-bit words. Byte pointers were synthesized in software by storing an offset in the 3 high-order bits of a pointer object. Subtracting two pointers in the above manner, if they both had 0 offsets, would give you the number of words, not bytes, between the addresses. And if they had non-0 offsets, it would give you meaningless garbage. (Conversion from a pointer to an integer would just copy the bits; it could have done the work to give you a meaningful byte index, but it didn't.)
The solution was simple: drop the casts and use pointer arithmetic:
long difference = p1 - p0;
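A slightly fuller sketch of the same fix, under stated assumptions: the function name byte_distance is hypothetical, and the result type is ptrdiff_t (from stddef.h), the type the language gives to a pointer difference, rather than the long used in the original code. Both pointers must point into, or one past the end of, the same array object; otherwise the subtraction is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes from `from` to `to`.
   Both pointers must refer to the same array object (or one past its end). */
ptrdiff_t byte_distance(const unsigned char *from, const unsigned char *to)
{
    assert(from != NULL && to != NULL);
    /* Plain pointer subtraction; no conversion to an integer type. */
    return to - from;
}
```

Because unsigned char has size 1, the difference is a byte count. To measure the distance between elements of some other type, convert both addresses to unsigned char* first, for example byte_distance((unsigned char*)&values[0], (unsigned char*)&values[2]).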
Other addressing schemes are possible. For example, an address might consist of a descriptor that (perhaps indirectly) references a block of memory, plus an offset within that block.
You can assume that addresses are just numbers, that the address space is linear and monolithic, that all pointers are the same size and have the same representation, that a pointer can be safely converted to int, or to long, and back again without loss of information. And the code you write based on those assumptions will probably work on most current systems. But it's entirely possible that some future systems will again use a different memory model, and your code will break.
If you avoid making any assumptions beyond what the language actually guarantees, your code will be far more future-proof. And even leaving portability issues aside, it will probably be cleaner.
So much insanity present here...
%p is generally the correct format specifier to use if you just want to print out a representation of the pointer. Never, ever use %d.
The length of an int and the length of a pointer (void* or otherwise) have no relationship. Most data models on i386 just happen to have 32-bit ints AND 32-bit pointers -- other platforms, including x86-64, are not the same! (This is also historically known as "all the world's a VAX syndrome".) http://en.wikipedia.org/wiki/64-bit#64-bit_data_models
If for some reason you want to hold a memory address in an integral variable, use the right types: intptr_t and uintptr_t. They're in stdint.h. See http://en.wikipedia.org/wiki/Stdint.h#Integers_wide_enough_to_hold_pointers
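As a rough illustration of using uintptr_t, the sketch below stores a pointer in an integer, converts it back, and prints the integer form with the PRIxPTR macro from inttypes.h. The function names are made up for this example. Note that uintptr_t is an optional type in the C standard, so this assumes an implementation that provides it (most do).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: stores an object pointer in a uintptr_t and
   converts it back. The result compares equal to the original pointer. */
void *round_trip_through_uintptr(void *ptr)
{
    uintptr_t as_integer = (uintptr_t)ptr;
    return (void *)as_integer;
}

/* Hypothetical helper: writes the integer form of ptr in hexadecimal,
   prefixed with "0x". Returns snprintf's result. */
int format_address(char *out, size_t out_size, const void *ptr)
{
    assert(out != NULL && out_size > 0);
    /* PRIxPTR expands to the correct conversion for uintptr_t. */
    return snprintf(out, out_size, "0x%" PRIxPTR, (uintptr_t)ptr);
}
```

The numeric value produced this way is still implementation-defined; the only guarantee is that converting it back to void* yields a pointer equal to the original.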
Although the vast majority of C implementations store pointers to all kinds of objects using the same representation, the C Standard does not require that all implementations do so, nor does it even provide any means by which a program that would exploit commonality of representations could test whether an implementation follows the common practice and refuse to run if it doesn't.
If, on some particular platform, an int* held a word address, while both char* and void* combined a word address with a word that identifies a byte within a word, passing an int* to a function that is expecting to retrieve a variadic argument of type char* or void* would result in that function trying to fetch more data from the stack (a word address plus the supplemental word) than had been pushed (just the word address). This could cause the system to malfunction in unpredictable ways.
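The callee's side of this can be sketched with a small variadic function. This is an illustrative example only; count_non_null is a hypothetical helper, not anything from the standard library. Like printf with "%p", it retrieves each argument as a void*, so callers are expected to pass exactly that type.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of the `count` variadic
   arguments are non-null. Each argument is retrieved as a void*. */
int count_non_null(int count, ...)
{
    va_list args;
    int non_null = 0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* The callee decides the type here; the compiler cannot check
           that the caller actually passed a void*. */
        void *p = va_arg(args, void *);
        if (p != NULL)
            non_null++;
    }
    va_end(args);
    return non_null;
}
```

A portable call looks like count_non_null(2, (void*)&a, (void*)&d). Passing &a without the cast would hand the function an int* where it reads a void*, which is the mismatch described above.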
Many compilers for commonplace platforms that use the same representation for all pointers will process an action which passes a non-void pointer precisely the same way as they would process an action which casts the pointer to void* before passing it. They thus have no reason to care about whether the pointer type that is passed as a variadic argument will precisely match the pointer type expected by the recipient. Although the Standard could have specified that such implementations, which would have no reason to care about pointer types, should behave as though the pointers were cast to void*, the authors of the C89 Standard avoided describing anything which wouldn't be common to all conforming compilers. The Standard's terminology for a construct that 99% of implementations should process identically, but 1% might process unpredictably, is "Undefined Behavior". Implementations may, and often should, extend the semantics of the language by specifying how they will treat such constructs, but that's a Quality of Implementation issue outside the Standard's jurisdiction.
In C, void * is an un-typed pointer. void does not mean void... it means anything. Thus casting to void * would be the same as casting to "pointer" in another language.
Using (int *)&a should work too... but the stylistic point of saying (void *) is to say -- I don't care about the type -- just that it is a pointer.
Note: It is possible for an implementation of C to cause this construct to fail and still meet the requirements of the standards. I don't know of any such implementations, but it is possible.