Real thing about "->" and "."

Asked 2023-01-01 17:31

I have always wanted to know what the real difference is between how the compiler sees a pointer to a struct (in C, say) and a struct itself.

struct person p;
struct person *pp;

For pp->age, I always imagine that the compiler computes "value of pp + offset of the member age in the struct".

But what does it do with p.age? It would be almost the same. To me, the programmer, p is not a memory address; it is "the structure itself". But of course that is not how the compiler deals with it.

My guess is it's more of a syntactic thing, and the compiler always does (&p)->age.

Am I correct?

p->q is essentially syntactic sugar for (*p).q in that it dereferences the pointer p and then goes to the proper field q within it. It saves typing for a very common case (pointers to structs).

In essence, -> performs two steps (a pointer dereference, then a field access) while . performs only one (a field access).

Because of the extra dereference, the compiler cannot in general replace -> with a static address; it must emit at least an address computation, since the pointer's value, and therefore the location accessed, can change at runtime. A . access, by contrast, can in some cases be compiled to an access at a fixed address, because the base struct's own address may be fixed.

Updated (see comments):

You have the right idea, but there is an important difference for global and static variables only: when the compiler sees p.age for a global or static variable, it can replace it, at compile time, with the exact address of the age field within the struct.

In contrast, pp->age must be compiled as "value of pp + offset of age field", since the value of pp can change at runtime.

The two statements are not equivalent, even from the "compiler perspective". The statement p.age translates to the address of p + the offset of age, while pp->age translates to the address contained in pp + the offset of age.

The address of a variable and the address contained in a (pointer) variable are very different things.

Say the offset of age is 5. If p is a structure, its address might be 100, so p.age references address 105.

But if pp is a pointer to a structure, its address might be 100, but the value stored at address 100 is not the beginning of a person structure, it's a pointer. So the value at address 100 (the address contained in pp) might be, for example, 250. In that case, pp->age references address 255, not 105.

If p is a local (automatic) variable, it is stored on the stack, so the compiler accesses it at an offset from the stack pointer (SP) or frame pointer (FP or BP, on architectures that have one). In contrast, *pp refers to a memory address that is [usually] allocated on the heap, so the stack registers are not used.

This is a question I've always asked myself.

v.x, the member operator, is valid only for structs (and unions). v->x, the member-of-pointer operator, is valid only for pointers to them.

So why have two different operators, since only one is needed? For example, only the . operator could be used; the compiler always knows the type of v, so it knows what to do: v.x if v is a struct, (*v).x if v is a struct pointer.

I have three theories:

temporary shortsightedness by K&R (which theory I'd like to be false)
making the job easier for the compiler (a practical theory, given the conception time of C :)
readability (which theory I prefer)

Unfortunately, I don't know which one (if any) is true.

In both cases the structure and its members are addressed by

address(person) + offset(age)

Using p with a struct stored on the stack gives the compiler more room to optimize memory usage. It could store only age instead of the whole struct if nothing else is used, in which case the addressing formula above no longer applies (I think taking the address of the struct prevents this optimization).
A struct on the stack may have no memory address at all. If the struct is small enough and short-lived, it can be mapped to some of the processor's registers (and, as with the optimization above, taking its address prevents this).

The short answer: when the compiler does not optimize, you are right. As soon as the compiler starts optimizing, only what the C standard specifies is guaranteed.

Edit: Removed the flawed stack/heap location for "pp->", since the pointed-to struct can be on either the heap or the stack.
