Questions about "this pointer adjustor" in C++ object layout
I am confused about one question: under what circumstances does the MS VC++ compiler generate a this adjustor? Note that the this adjustor is not necessarily in a thunk. Below is my test code.
class myIUnknown
{
public:
virtual void IUnknown_method1(void)=0;
virtual void IUnknown_method2(void)=0;
int data_unknown_1;
int data_unknown_2;
};
class BaseX:public myIUnknown
{
public:
BaseX(int);
virtual void base_x_method1(void)=0;
virtual void base_x_method2(void)=0;
int data_base_x;
int data_unknown_1;
int data_unknown_2;
};
class BaseY:public myIUnknown
{
public:
BaseY(int);
virtual void base_y_method1(void);
virtual void base_y_method2(void)=0;
int data_base_y;
int data_unknown_1;
int data_unknown_2;
};
class ClassA:public BaseX, public BaseY
{
public:
ClassA(void);
//myIUnknown
void IUnknown_method1(void);
void IUnknown_method2(void);
//baseX
void base_x_method1(void) ;
void base_x_method2(void) ;
//baseY
//void base_y_method1(void) ;
void base_y_method2(void) ;
virtual void class_a_method(void);
int data_class_a;
int data_unknown_1;
int data_unknown_2;
};
The object layout is as below:
1> class ClassA size(60):
1> +---
1> | +--- (base class BaseX)
1> | | +--- (base class myIUnknown)
1> 0 | | | {vfptr}
1> 4 | | | data_unknown_1
1> 8 | | | data_unknown_2
1> | | +---
1> 12 | | data_base_x
1> 16 | | data_unknown_1
1> 20 | | data_unknown_2
1> | +---
1> | +--- (base class BaseY)
1> | | +--- (base class myIUnknown)
1> 24 | | | {vfptr}
1> 28 | | | data_unknown_1
1> 32 | | | data_unknown_2
1> | | +---
1> 36 | | data_base_y
1> 40 | | data_unknown_1
1> 44 | | data_unknown_2
1> | +---
1> 48 | data_class_a
1> 52 | data_unknown_1
1> 56 | data_unknown_2
1> +---
1>
1> ClassA::$vftable@BaseX@:
1> | &ClassA_meta
1> | 0
1> 0 | &ClassA::IUnknown_method1
1> 1 | &ClassA::IUnknown_method2
1> 2 | &ClassA::base_x_method1
1> 3 | &ClassA::base_x_method2
1> 4 | &ClassA::class_a_method
1>
1> ClassA::$vftable@BaseY@:
1> | -24
1> 0 | &thunk: this-=24; goto ClassA::IUnknown_method1 <=====in-thunk "this adjustor"
1> 1 | &thunk: this-=24; goto ClassA::IUnknown_method2 <=====in-thunk "this adjustor"
1> 2 | &BaseY::base_y_method1
1> 3 | &ClassA::base_y_method2
1>
1> ClassA::IUnknown_method1 this adjustor: 0
1> ClassA::IUnknown_method2 this adjustor: 0
1> ClassA::base_x_method1 this adjustor: 0
1> ClassA::base_x_method2 this adjustor: 0
1> ClassA::base_y_method2 this adjustor: 24 <============non-in-thunk "this adjustor"
1> ClassA::class_a_method this adjustor: 0
I found that this pointer adjustors are generated for the following invocations:
in-thunk this adjustor:
pY->IUnknown_method1();//adjustor this! this-=24 pY-24==>pA
pY->IUnknown_method2();//adjustor this! this-=24 pY-24==>pA
non-in-thunk this adjustor:
pA->base_y_method2();//adjustor this! this+=24 pA+24==>pY
Could anyone tell me why the compiler produces a this adjustor in the above invocations?
Under what circumstances will the compiler generate the this adjustor?
Many thanks.
Perhaps it's easiest to start by thinking how single inheritance is (typically) implemented in C++. Consider a hierarchy that includes at least one virtual function:
struct Base {
int x;
virtual void f() {}
virtual ~Base() {}
};
struct Derived : Base {
int y;
virtual void f() {}
virtual ~Derived() {}
};
In a typical case, this will be implemented by creating a vtable for each class and giving each object a (hidden) vtable pointer. Every object (of either Base or Derived class) will have its vtable pointer at the same offset in the structure, and each vtable will contain the pointers to the virtual functions (f and the dtor) at the same offsets.
Now, consider polymorphic use of these types, such as:
void g(Base&b) {
b.f();
}
Since both Base and Derived (and any other derivatives of Base) all have the vtable arranged the same way, and a pointer to the vtable at the same offset in the structure, the compiler can generate exactly the same code for this, regardless of whether it's dealing with a Base, a Derived, or something else derived from Base.
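A minimal sketch of that point, assuming a mainstream compiler (MSVC, GCC, Clang). The classes are a variation on the ones above (f returns an int so the result can be checked), and the helper names call_f and base_offset are made up for illustration. The zero offset is what these compilers produce in practice; the language standard does not promise it.

```cpp
#include <cstddef>

struct Base {
    int x = 0;
    virtual int f() { return 1; }
    virtual ~Base() {}
};

struct Derived : Base {
    int y = 0;
    int f() override { return 2; }
};

// One compiled body serves every dynamic type: load the vtable pointer
// from a fixed offset in the object, then call through a fixed slot.
int call_f(Base& b) { return b.f(); }

// Byte distance from the start of a Derived object to its Base subobject.
// With single inheritance this is typically 0, so no adjustment is needed.
std::ptrdiff_t base_offset(Derived& d) {
    Base* pb = &d;  // implicit derived-to-base conversion
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&d);
}
```

Calling call_f with a Base and with a Derived runs the same machine code and still reaches the right override, because the Base part sits at the very start of the Derived object.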
When you add multiple inheritance to the mix, however, this changes. In particular, you can't arrange all your objects so the pointer to the vtable is always at the same offset in every object, for the simple reason that an object that's derived from two base classes will (potentially) have pointers to two separate vtables, which clearly can't be at the same offset in the structure (i.e., you can't put two different things in exactly the same place). To accommodate this, you have to do some sort of explicit adjustment. Each multiply derived class has to have some way for the compiler to find the vtables for all the base classes. Consider something like this:
struct Base1 {
virtual void f() { }
};
struct Base2 {
virtual void g() {}
};
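// struct rather than class, so the bases are public and a Derived1 or
// Derived2 can actually be passed where a Base1& is expected
struct Derived1 : Base1, Base2 {
virtual void f() {}
virtual void g() {}
};
struct Derived2 : Base2, Base1 {
virtual void f() {}
virtual void g() {}
};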
In a typical case, the compiler will arrange the vtable pointers in the same order you specify the base classes, so Derived1 will have a pointer to Base1's vtable followed by a pointer to Base2's vtable. Derived2 will reverse the order.
Now, assume the same kind of function that makes a polymorphic call to f(), but one that may be passed a reference to a Base1, a Derived1, or a Derived2. One of those will almost inevitably have its pointer to Base1's vtable at a different offset than the others. This is where the "this-adjustor" (or whatever you prefer to call it) comes in -- it finds the correct offset for the base class you're trying to use, so when you access members of that class, you get the right data.
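The difference in offsets can be observed directly. The following is a sketch rather than a description of any particular compiler's output: the classes mirror the ones above but use public inheritance and int return values so the calls can be checked, and offset_of_base and call_f are hypothetical helper names. It assumes the usual layout in which bases are placed in declaration order.

```cpp
#include <cstddef>

struct Base1 { virtual int f() { return 1; } virtual ~Base1() {} };
struct Base2 { virtual int g() { return 2; } virtual ~Base2() {} };

struct Derived1 : Base1, Base2 {
    int f() override { return 11; }
    int g() override { return 12; }
};
struct Derived2 : Base2, Base1 {
    int f() override { return 21; }
    int g() override { return 22; }
};

// Byte distance from the start of the complete object to one of its bases.
// The derived-to-base conversion is where the compiler applies the adjustment.
template <class B, class D>
std::ptrdiff_t offset_of_base(D& d) {
    B* pb = &d;
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&d);
}

// Receives a reference that already points at the Base1 subobject,
// wherever that subobject lives inside the complete object.
int call_f(Base1& b) { return b.f(); }
```

On typical implementations offset_of_base<Base1> is 0 for a Derived1 and one pointer's width for a Derived2, so the caller has to adjust the address before call_f ever sees it.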
Note that while I've used the pointer to the vtable as the primary example here, it's not the only possibility. In fact, even if you have no virtual functions in any of the classes, you still need access to the data for each base class, which requires the same kind of adjustment.
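To illustrate the case with no virtual functions at all, here is a small sketch. The types A, B and C and the helpers read_b and b_offset are invented for this example, and the non-zero offset assumes the common layout where A is placed before B.

```cpp
#include <cstddef>

struct A { int a = 1; };
struct B { int b = 2; };
struct C : A, B { int c = 3; };

// Knows nothing about C; it expects the address of a B.
int read_b(const B& r) { return r.b; }

// Byte distance from the start of a C object to its B subobject.
std::ptrdiff_t b_offset(C& obj) {
    B* pb = &obj;  // the compiler adds the offset of B within C here
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&obj);
}

// Converting back subtracts the same offset again.
bool round_trips(C& obj) {
    B* pb = &obj;
    return static_cast<C*>(pb) == &obj;
}
```

Passing a C to read_b works only because the conversion to B& moves the address past the A part; without that adjustment the function would read A's data instead.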
I've been doing C++ for well over a decade and have never needed to worry about any of this. However, it looks like your "this adjustor" comes into play during MI for base classes that are not at the beginning of the structure.
It's a virtual-virtual step.
Think of the table as a virtual vtable (as opposed to just a vtable). The virtual-virtual step requires some calculation: given a this pointer, calculate the vtable. (Or, in this case, given a vtable, calculate another vtable.) That calculation is performed by the thunk. But if you don't need to perform a virtual operation, then you don't need to find the other vtable, and you don't need to perform the calculation, so you don't need the thunk. That's why some steps are mere offsets and others are implemented as thunks. It's that virtual-virtual step.
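As a rough picture of what a thunk like the `this-=24; goto ...` entries in the question's listing does, the sketch below writes one by hand. It is not the code the compiler emits. The classes are cut-down versions of the question's BaseX, BaseY and ClassA, and method_body, thunk_via_BaseY and offset_of_BaseY are hypothetical names.

```cpp
#include <cstddef>

struct BaseX {
    int data_base_x = 0;
    virtual int method() = 0;
    virtual ~BaseX() {}
};
struct BaseY {
    int data_base_y = 0;
    virtual int method() = 0;
    virtual ~BaseY() {}
};
struct ClassA : BaseX, BaseY {
    int data_class_a = 42;
    int method() override { return data_class_a; }
};

// Stand-in for the real function body, which expects a pointer to the
// start of the complete ClassA object.
int method_body(ClassA* self) { return self->data_class_a; }

// Hand-written stand-in for the thunk in the BaseY vtable slot: it receives
// a pointer to the BaseY subobject, moves it back to the start of ClassA,
// and then continues into the real body.
int thunk_via_BaseY(BaseY* y) {
    ClassA* self = static_cast<ClassA*>(y);  // "this -= offset of BaseY"
    return method_body(self);
}

// The amount the thunk subtracts (24 in the question's 32-bit layout).
std::ptrdiff_t offset_of_BaseY(ClassA& a) {
    BaseY* y = &a;
    return reinterpret_cast<char*>(y) - reinterpret_cast<char*>(&a);
}
```

A call through a BaseY* starts with a pointer that is offset_of_BaseY bytes into the object, so something has to undo that offset before the body touches data_class_a. A call through a BaseX* or ClassA* already points at the start of the object and needs no such step.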
You may also wish to review my paper on the MS C++ Object Mapping, "C++: Under the Hood", still available here.
Happy hacking!