Is apparent NULL pointer dereference in C actually pointer arithmetic?
I've got this piece of code. It appears to dereference a null pointer here, but then bitwise-ANDs the result with unsigned int. I really don't understand the whole part. What is it intended to do? Is this a form of pointer arithmetic?
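#include <stdio.h>

struct hi
{
    long a;
    int b;
    long c;
};

int main()
{
    struct hi ob={3,4,5};
    struct hi *ptr=&ob;
    /* offset of b within struct hi, computed through a null struct pointer */
    int num= (unsigned int) & (((struct hi *)0)->b);
    printf("%d",num);
    /* read b through ptr plus that byte offset */
    printf("%d",*(int *)((char *)ptr + (unsigned int) & (((struct hi *)0)->b)));
}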
The output I get is 44. But how does it work?
It isn't really dereferencing the null pointer. You should look at the whole expression. What the code says is: take the number 0, treat it as a struct hi *, select element b in the struct it points to, and take the address of this element. The result of this operation is the offset of element b from the beginning of the struct. When you add that offset to the pointer ptr (viewed as a char *) and read the result as an int, you get element b, which equals 4.
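The same read can be written with the standard offsetof macro from <stddef.h>, which avoids spelling out the null-pointer cast by hand. This is only a minimal sketch; the helper name read_b_by_offset is made up for illustration and is not part of the original code.

```c
#include <stddef.h>   /* offsetof */

struct hi {
    long a;
    int  b;
    long c;
};

/* Hypothetical helper: reads member b through base address + byte offset. */
int read_b_by_offset(const struct hi *p)
{
    const char *base = (const char *)p;                 /* byte-wise view of the struct */
    const char *addr = base + offsetof(struct hi, b);   /* address of member b */
    return *(const int *)addr;                          /* same object as p->b */
}
```

Called with a pointer to the question's ob, which was initialised as {3,4,5}, this should return the same value as ob.b.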
This gives you the offset in bytes of the b field inside the hi struct.
((struct hi *)0) is a pointer to a hi struct, starting at address 0.
(((struct hi *)0)->b) is the b field of the above struct.
& (((struct hi *)0)->b) is the address of the above field. Because the hi struct is located at address 0, this is the offset of b within the struct.
(unsigned int) & (((struct hi *)0)->b) is a conversion of that from the address type to unsigned int, so that it can be used as a number.
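The "address of the member minus address of the struct" idea can also be checked on a real object instead of an imaginary one at address 0. The sketch below assumes nothing beyond the struct from the question; the function name offset_of_b_measured is hypothetical.

```c
#include <stddef.h>   /* size_t */

struct hi {
    long a;
    int  b;
    long c;
};

/* Hypothetical helper: measures the offset of b on a real object. */
size_t offset_of_b_measured(void)
{
    struct hi obj = {0, 0, 0};
    const char *base   = (const char *)&obj;     /* where the struct starts */
    const char *member = (const char *)&obj.b;   /* where b lives inside it */
    return (size_t)(member - base);              /* distance in bytes */
}
```

When the struct is imagined at address 0, base is 0 and the subtraction disappears, which is why the address of the member is the offset in the original expression.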
You're not actually dereferencing a NULL pointer. You're just doing pointer arithmetic.
Accessing (((struct hi *)0)->b) will typically give you a segmentation fault because you're trying to access a forbidden memory location.
Using & (((struct hi *)0)->b) does not give you a segmentation fault because you're only taking the address of that forbidden memory location, but you're not trying to access said location.
This is not an "and"; it is the unary address-of operator, which takes the address of its right-hand operand.
This is a standard hack to get the offset of a struct member. You cast 0 to a pointer to struct hi, then reference the 'b' member and take its address. Then you add this offset to the pointer "ptr" and get the real address of the 'b' field of the struct pointed to by ptr, which is ob. Then you cast that pointer to an int pointer (because b is an int), dereference it, and output the value.
This is the 2nd print.
The first print outputs num, which is 4 not because b's value is 4, but because 4 is the offset of the b field in the hi struct. That offset is sizeof(long), because b follows a, and a is a long (4 bytes on this platform).
Hope this makes sense :)
You must be using a 32-bit compile (or a 64-bit compile on Windows).
The first expression - for num - is a common implementation of the offsetof macro from <stddef.h>; it is not portable, but it often works.
The second expression adds 4 to the base address of the object that ptr points to and reads the int stored there, which is the value 4 in the structure.
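To see how the result depends on the size of long on a given build, something like the following sketch could be used. It relies only on the standard offsetof macro; the function name print_hi_layout is made up for this example.

```c
#include <stddef.h>   /* offsetof */
#include <stdio.h>

struct hi {
    long a;
    int  b;
    long c;
};

/* Hypothetical helper: prints the size of long and each member's offset.
   The numbers depend on the platform and compiler. */
void print_hi_layout(void)
{
    printf("sizeof(long)   = %zu\n", sizeof(long));
    printf("offset of a    = %zu\n", offsetof(struct hi, a));
    printf("offset of b    = %zu\n", offsetof(struct hi, b));
    printf("offset of c    = %zu\n", offsetof(struct hi, c));
    printf("sizeof(struct) = %zu\n", sizeof(struct hi));
}
```

On a build where long is 4 bytes, the offset of b would typically be 4, matching the first digit of the question's output; where long is 8 bytes it would typically be 8.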
Your output does not include a newline - it probably should (the behaviour is not completely portable because it is implementation defined if you don't include the newline: C99 §7.19.2: "Whether the last line requires a terminating new-line character is implementation-defined."). On a Unix box, it is messy because the next prompt will appear immediately after the 44.
Just to clarify, you need to understand when an expression like this counts as a NULL-pointer dereference and when it does not. With the & (address-of) operator applied to the member access, compilers in practice only compute the member's address and never read memory. Strictly speaking, the C standard guarantees that nothing is evaluated only for the &*p and &a[i] forms, so the null-pointer version is not strictly portable, which is why <stddef.h> provides offsetof.
So &(((struct T *)0)->b) is effectively compiled as "the address that many bytes past 0", with 0 treated as a struct T *. This really obfuscates things for beginners. However, the technique is widely used in the Linux kernel, and it is what makes list_entry, list_head and similar pointer arithmetic work.
In any event, it's a programmatic way of finding the offset of 'b' within the struct T object. It's used in offsetof as well as in list operations such as list_entry.
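As a rough illustration of how an offset lets list_entry-style code get from an embedded member back to the enclosing struct, here is a simplified sketch. It is not the kernel's actual macro; CONTAINER_OF, struct link, struct item and item_from_link are all hypothetical names used only for this example.

```c
#include <stddef.h>   /* offsetof, NULL */

/* Simplified, hypothetical version of the idea; the kernel's real macros differ. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct link {
    struct link *next;
};

struct item {
    int value;
    struct link node;   /* embedded link, in the spirit of list_head */
};

/* Recover the enclosing item from a pointer to its embedded link. */
struct item *item_from_link(struct link *l)
{
    return CONTAINER_OF(l, struct item, node);
}
```

Subtracting the member's offset from the member's address is the reverse of the addition done in the question's second printf.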
For more information, you can read about this in Robert Love's book "Linux Kernel Development".