Is there any penalty/cost of virtual inheritance in C++, when calling non-virtual base method?
Does using virtual inheritance in C++ have a runtime penalty in compiled code, when we call a regular function member from its base class? Sample code:
class A {
public:
void foo(void) {}
};
class B : virtual public A {};
class C : virtual public A {};
class D : public B, public C {};
// ...
D bar;
bar.foo ();
There may be, yes, if you call the member function via a pointer or reference and the compiler can't determine with absolute certainty what type of object that pointer or reference points or refers to. For example, consider:
void f(B* p) { p->foo(); }
void g()
{
D bar;
f(&bar);
}
Assuming the call to f is not inlined, the compiler needs to generate code to find the location of the A virtual base class subobject in order to call foo. Usually this lookup involves checking the vptr/vtable.
If the compiler knows the type of the object on which you are calling the function, though (as is the case in your example), there should be no overhead because the function call can be dispatched statically (at compile time). In your example, the dynamic type of bar is known to be D (it can't be anything else), so the offset of the virtual base class subobject A can be computed at compile time.
Yes, virtual inheritance has a run-time performance overhead. This is because the compiler, for any pointer/reference to an object, cannot find its sub-objects at compile time. In contrast, for single inheritance, each sub-object is located at a static offset from the start of the original object. Consider:
class A { ... };
class B : public A { ... };
The memory layout of B looks a little like this:
| B's stuff | A's stuff |
In this case, the compiler knows where A is. However, now consider the case of multiple virtual inheritance (MVI).
class A { ... };
class B : public virtual A { ... };
class C : public virtual A { ... };
class D : public C, public B { ... };
B's memory layout:
| B's stuff | A's stuff |
C's memory layout:
| C's stuff | A's stuff |
But wait! When D is instantiated, it doesn't look like that.
| D's stuff | B's stuff | C's stuff | A's stuff |
Now, if you have a B* that really points to a B, then A is right next to the B. But if it points to a D, then in order to obtain an A* you need to skip over the C sub-object. Since any given B* could point to a B or a D at run time, the pointer has to be adjusted dynamically. At a minimum, this means the compiler must emit code to find that offset by some means, as opposed to having the value baked in at compile time, which is what happens with single inheritance.
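The layout difference described above can be observed directly. The following is only a minimal sketch, not code from the answer: the helper names offset_to_A, call_foo and demo are hypothetical, and the actual offsets are implementation details that the C++ standard does not fix, so the code measures them rather than assuming values.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a = 0; void foo() { ++a; } };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : C, B { int d = 0; };

// Hypothetical helper: byte distance from a B subobject to its A base.
// The implicit B* -> A* conversion is the step that has to locate the
// virtual base at run time, because p may point into a B or into a D.
std::ptrdiff_t offset_to_A(B* p) {
    A* base = p;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(p);
}

// Calls the non-virtual A::foo through a B*; the same conversion is
// applied implicitly to produce the this pointer for foo.
void call_foo(B* p) { p->foo(); }

// The call works for both dynamic types, whatever the layout is.
void demo() {
    B b;
    D d;
    call_foo(&b);
    call_foo(&d);
    assert(b.a == 1 && d.a == 1);
}
```

On mainstream compilers, offset_to_A typically returns a different value for a standalone B than for the B subobject of a D, which is exactly why call_foo cannot use a constant baked in at compile time.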
At least in a typical implementation, virtual inheritance carries a (small!) penalty for (at least some) access to data members. In particular, you normally end up with an extra level of indirection to access the data members of the object from which you've derived virtually. This comes about because (at least in the normal case) two or more separate derived classes have not just the same base class, but the same base class object. To accomplish this, both of the derived classes have pointers to the same offset into the most derived object, and access those data members via that pointer.
Although it's technically not due to virtual inheritance, it's probably worth noting that there's a separate (again, small) penalty for multiple inheritance in general. In a typical implementation of single inheritance, you have a vtable pointer at some fixed offset in the object (quite often the very beginning). In the case of multiple inheritance, you obviously can't have two vtable pointers at the same offset, so you end up with a number of vtable pointers, each at a separate offset in the object.
IOW, the vtable pointer with single inheritance is normally just static_cast<vtable_ptr_t>(object_address), but with multiple inheritance you get static_cast<vtable_ptr_t>(object_address+offset).
Technically, the two are entirely separate -- but of course nearly the only use for virtual inheritance is in conjunction with multiple inheritance, so it's semi-relevant anyway.
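As a rough illustration of the multiple-inheritance point (a sketch with hypothetical names First, Second, Both, as_first, as_second and demo, none of which come from the answer): each polymorphic base subobject lives at its own address inside the derived object, so converting to the second base yields a different pointer value. Without virtual bases, that adjustment is a constant the compiler knows statically.

```cpp
#include <cassert>

struct First  { virtual ~First() = default;  int x = 0; };
struct Second { virtual ~Second() = default; int y = 0; };
struct Both : First, Second { int z = 0; };

// Address of the First subobject, as an untyped pointer.
const void* as_first(const Both& obj) {
    return static_cast<const First*>(&obj);
}

// Address of the Second subobject. With non-virtual bases the
// static_cast adds an offset that is fixed at compile time.
const void* as_second(const Both& obj) {
    return static_cast<const Second*>(&obj);
}

// The two non-empty base subobjects cannot share an address.
void demo() {
    Both obj;
    assert(as_first(obj) != as_second(obj));
}
```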
Concretely in Microsoft Visual C++ there is an actual difference in pointer-to-member sizes. See #pragma pointers_to_members. As you can see in that listing - the most general method is "virtual inheritance" which is distinct from multiple inheritance which in turn is distinct from single inheritance.
That implies that more information is needed to resolve a pointer-to-member when virtual inheritance is present, and it will have a performance impact, if only through the amount of data taken up in the CPU cache, though likely also in the length of the member lookup or the number of jumps needed.
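To see what a given compiler does, the sizes can simply be compared. This is a sketch under assumptions: the class names Single, Other, Multi and Virt are made up for illustration, and the results are implementation-specific. Compilers that follow the Itanium C++ ABI (such as GCC and Clang on Linux) typically use one size for all three, while MSVC uses different representations, as described above.

```cpp
#include <cstddef>

struct Single { void f() {} };
struct Other  { void g() {} };
struct Multi : Single, Other { void h() {} };   // multiple inheritance
struct Virt  : virtual Single { void k() {} };  // virtual inheritance

// Sizes of pointers to member functions for the three kinds of class.
// The values depend on the compiler and ABI; nothing here assumes them.
constexpr std::size_t size_single = sizeof(void (Single::*)());
constexpr std::size_t size_multi  = sizeof(void (Multi::*)());
constexpr std::size_t size_virt   = sizeof(void (Virt::*)());
```

Printing the three constants on the compilers you care about shows whether pointer-to-member size is a factor for your build.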
I think there is no runtime penalty for virtual inheritance. Don't confuse virtual inheritance with virtual functions. They are two different things.
Virtual inheritance ensures that you have only one sub-object A in instances of D. So I don't think there would be a runtime penalty for it alone.
However, there can be cases where the location of this sub-object cannot be known at compile time, so in such cases there would be a runtime penalty for virtual inheritance. One such case is described by James in his answer.
Your question is focused mostly on calling regular functions of the virtual base, not the (far) more interesting case of virtual functions of the virtual base class (class A in your example)-- but yes, there can be a cost. Of course everything is compiler dependent.
When the compiler compiled A::foo, it assumed that "this" points to the start of where the data members for A reside in memory. At this time, the compiler might not know that class A will be a virtual base of any other class. But it happily generates the code.
Now, when the compiler compiles B, there won't really be a change because, while A is a virtual base class, it is still single inheritance, and in the typical case the compiler will lay out class B by placing class A's data members immediately followed by class B's data members-- so a B * can be immediately cast to an A * without any change in value, and hence no adjustments need to be made. The compiler can call A::foo using the same "this" pointer (even though it is of type B *) and there is no harm.
The same situation applies to class C-- it's still single inheritance, and the typical compiler will place A's data members immediately followed by C's data members, so a C * can be immediately cast to an A * without any change in value. Thus, the compiler can simply call A::foo with the same "this" pointer (even though it is of type C *) and there is no harm.
However, the situation is totally different for class D. The layout of class D will typically be class A's data members, followed by class B's data members, followed by class C's data members, followed by class D's data members.
Using the typical layout, a D * can be immediately convertible to an A *, so there is no penalty for A::foo-- the compiler can call the same routine it generated for A::foo without any change to "this" and everything is fine.
However, the situation changes if the compiler needs to call a member function such as C::other_member_func, even if C::other_member_func is non-virtual. The reason is that when the compiler wrote the code for C::other_member_func, it assumed that the data layout referenced by the "this" pointer is A's data members immediately followed by C's data members. But that is not true for an instance of D. The compiler may need to rewrite and create a (non-virtual) D::other_member_func, just to take care of the class instance memory layout difference.
Note that multiple inheritance without virtual bases is a different but similar situation: there, the compiler can take care of everything by simply adding a displacement or fixup to the "this" pointer to account for where a base class is "embedded" within an instance of a derived class. But with virtual bases, sometimes a function rewrite is needed. It all depends on what data members are accessed by the (even non-virtual) member function being called.
For example, if class C defined a non-virtual member function C::some_member_func, the compiler might need to write:
- C::some_member_func when called from an actual instance of C (and not D), as determined at compile time (because some_member_func isn't a virtual function)
- C::some_member_func when the same member function is called from an actual instance of class D, as determined at compile time. (Technically this routine is D::some_member_func. Even though the definition of this member function is implicit and identical to the source code of C::some_member_func, the generated object code may be slightly different.)
if the code for C::some_member_func happens to use member variables defined in both class A and class C.
There has to be a cost to virtual-inheritance.
The proof is that virtually inherited classes occupy more than the sum of the parts.
Typical case:
struct A{double a;};
struct B1 : virtual A{double b1;};
struct B2 : virtual A{double b2;};
struct C : virtual B1, virtual B2{double c;}; // I think these virtuals are not strictly necessary
static_assert( sizeof(A) == sizeof(double) ); // as expected
static_assert( sizeof(B1) > sizeof(A) + sizeof(double) ); // the equality holds for non-virtual inheritance
static_assert( sizeof(B2) > sizeof(A) + sizeof(double) ); // the equality holds for non-virtual inheritance
static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) );
static_assert( sizeof(C) > sizeof(A) + sizeof(double) + sizeof(double) + sizeof(double) + sizeof(double));
(https://godbolt.org/z/zTcfoY)
What is stored additionally? I don't exactly understand. I think it is something like a virtual table but for accessing individual members.
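On common implementations, the extra space is a hidden pointer in each class that has a virtual base: under the Itanium C++ ABI (GCC, Clang) it is a vtable pointer whose table also records the offsets of the virtual bases, and MSVC uses a separate virtual base table pointer. The sketch below only measures the difference; the names PlainB, VirtB and extra_bytes are hypothetical, and the exact number of bytes is implementation-specific.

```cpp
#include <cstddef>

struct A { double a; };
struct PlainB : A { double b; };          // non-virtual inheritance
struct VirtB  : virtual A { double b; };  // virtual inheritance

// Bytes added by making the base virtual. On typical 64-bit
// implementations this matches the size of one pointer, but that is an
// observation about those ABIs, not a guarantee of the language.
constexpr std::size_t extra_bytes = sizeof(VirtB) - sizeof(PlainB);
```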
There is a cost of additional memory. For example, GCC 7 on x86-64 gives the following results:
#include <iostream>
class A { int a; };
class B: public A { int b; };
class C: public A { int c; };
class D: public B, public C { int d; };
class BV: virtual public A { int b; };
class CV: virtual public A { int c; };
class DV: public BV, public CV { int d; };
int main()
{
std::cout << sizeof(A) << std::endl;
std::cout << sizeof(B) << std::endl;
std::cout << sizeof(C) << std::endl;
std::cout << sizeof(D) << std::endl;
std::cout << sizeof(BV) << std::endl;
std::cout << sizeof(CV) << std::endl;
std::cout << sizeof(DV) << std::endl;
return 0;
}
This prints out:
4
8
8
20
16
16
40
As you can see, some extra bytes are added when you use virtual inheritance.
The other answers explain well why looking up the exact position of the virtual base class in memory incurs a performance penalty. That raises a follow-up question: "Can this penalty be reduced?" Fortunately, there is a partial solution in the form of the (not yet mentioned) final keyword. In particular, calls from the class D of the original example to the innermost base A can usually be made (almost) penalty-free, but in the general case only if you finalize D.
For why this is necessary, let's look at a multilevel class hierarchy:
class Base {};
class ExtA : public virtual Base {};
class ExtB : public virtual Base {};
class ExtC : public virtual Base {};
class App1 : public Base {};
class App2 : public ExtA {};
class App3 : public ExtB, public ExtC {};
class SuperApp : public App2, public App3 {};
Because our Application classes can use any of the Extension classes of our base class, none of those Extension classes can know at compile time where the Base subobject will be located within the object they are called with. Rather, they have to consult the virtual table at runtime to find out. This is because the various Ext and App classes can all be defined in different translation units.
But the same problem exists for the Application classes: because App2 and App3 inherit a virtualized Base via the Extension class(es), they don't know at compile time where that Base subobject is located within their own objects. So each method of App2 or App3 has to consult the virtual table to find the location of the Base subobject within its own object. This is because it is legal to combine those App classes further later on, as illustrated by the SuperApp class in the above hierarchy.
Also note that there is a further penalty if the Base class calls any virtual methods defined on the Extension or Application level. That's because the virtual method will be called with this pointing to a Base object, but it has to adjust this to the beginning of its own object by again consulting the virtual table. If an Extension or Application layer (virtual or non-virtual) method calls a virtual method defined on the Base class, that penalty is incurred twice: first for finding the Base subobject and then again for finding the real object relative to the Base subobject.
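The this adjustment for a virtual call that starts in the base can be made visible with a small sketch. The members hook, run, base_address and the function demo are hypothetical names, and this simplified Base has data members, unlike the empty Base in the hierarchy above. The override always sees the address of the derived object, while the calling Base method runs with the address of the Base subobject.

```cpp
#include <cassert>

struct Base {
    int state = 0;
    virtual ~Base() = default;
    // Virtual hook: reports the address that the final overrider sees as this.
    virtual const void* hook() const { return this; }
    // Non-virtual Base method that calls the virtual hook.
    const void* run() const { return hook(); }
    // Address of the Base subobject itself.
    const void* base_address() const { return this; }
};

struct ExtA : virtual Base {
    int ext_state = 0;
    // Inside the override, this points to the ExtA object, so the call
    // coming from Base::run had to adjust the pointer on the way in.
    const void* hook() const override { return this; }
};

void demo() {
    ExtA e;
    assert(e.run() == static_cast<const void*>(&e));
    assert(e.base_address() == static_cast<const void*>(static_cast<const Base*>(&e)));
}
```

On typical implementations the two addresses differ, because the virtual Base subobject is not placed at the start of the ExtA object; the standard does not promise that, so the sketch asserts only the guaranteed parts.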
However, if we know that a SuperApp combining several Apps won't be created, we can improve things a lot by declaring the App classes final:
class App1 final : public Base {};
class App2 final : public ExtA {};
class App3 final : public ExtB, public ExtC {};
// class SuperApp : public App2, public App3 {}; // illegal now!
Because final makes the layout immutable, methods of the Application classes don't need to go through a virtual table to find the Base subobject anymore. They just add the known constant offset to the this pointer when calling any Base method. And virtual callbacks at the Application layer can easily fix up the this pointer again by subtracting a known constant offset (or even not fix it up at all and reference the various fields from the middle of the object instead). Methods of the Base class also don't incur any penalty themselves, because inside that class everything works as normal. So in this three-level scenario with finalized classes on the outermost level, only the execution of methods on the Extensions level is slower, if they need to refer to fields or methods of the Base class, or if they are virtually called from the Base.
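A compact way to compare the two situations is sketched below. The names OpenApp, SealedApp, tick_open, tick_sealed and demo are hypothetical, and whether a particular compiler actually replaces the lookup with a constant offset for the final class is an optimization you would need to confirm in the generated assembly; the language only guarantees that both functions behave the same.

```cpp
#include <cassert>
#include <type_traits>

struct Base { int counter = 0; void tick() { ++counter; } };
struct ExtA : virtual Base {};

struct OpenApp : ExtA {};          // may still be combined into larger classes
struct SealedApp final : ExtA {};  // no further derivation, layout is fixed

// Through OpenApp&, the object may be part of a bigger class, so the
// Base subobject generally has to be located at run time.
int tick_open(OpenApp& app) { app.tick(); return app.counter; }

// Through SealedApp&, the dynamic type is known to be SealedApp, so a
// compiler may use a constant offset (an optimization, not a guarantee).
int tick_sealed(SealedApp& app) { app.tick(); return app.counter; }

static_assert(std::is_final<SealedApp>::value, "SealedApp cannot be derived from");

// Observable behaviour is identical; only the generated code may differ.
void demo() {
    OpenApp open_app;
    SealedApp sealed_app;
    assert(tick_open(open_app) == 1);
    assert(tick_sealed(sealed_app) == 1);
}
```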
The drawback of the final keyword is that it disallows all extensions. You cannot derive an App2a from App2 anymore, even if it doesn't require any of those Extensions. And declaring a non-final App2Base and then deriving final App2a and App2b from it would again incur penalties for all the methods in App2Base that refer to the original Base. Unfortunately, the C++ Gods didn't give us a way to just unvirtualize a base class but leave non-virtual extensions possible. They also didn't give us a way to declare a "master" Extension class whose layout stays fixed even if other Extensions with the same virtual Base class are also added (in this case, all the non-master Extensions would refer to the Base subobject within the master Extension).
The alternative to virtual inheritance like this is usually to add all the extension stuff to the Base class. Depending on the application, that might require a lot of extra and often unused fields and/or a lot of extra virtual method calls and/or a lot of dynamic_casts, which all come with a performance penalty, too.
Also note that in modern CPUs, the penalty after a mispredicted virtual function call is much higher than the penalty after a mispredicted this pointer fixup. The former needs to throw away all results obtained on the wrong execution path and restart afresh on the right path. The latter still needs to repeat all opcodes depending directly or indirectly on this, but doesn't need to load and decode instructions again. BTW: speculative execution with unknown pointer fixups is one of the reasons why CPUs are vulnerable to Spectre/Meltdown type data leaks.