What are Aggregates and PODs and how/why are they special?
This FAQ is about Aggregates and PODs and covers the following material:
- What are Aggregates?
- What are PODs (Plain Old Data)?
- How are they related?
- How and w开发者_运维百科hy are they special?
- What changes for C++11?
How to read:
This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.
This answer applies to C++03. For other C++ standards see:
- C++11 changes
- C++14 changes
- C++17 changes
- C++20 changes
What are aggregates and why they are special
Formal definition from the C++ standard (C++03 8.5.1 §1):
An aggregate is an array or a class (clause 9) with no user-declared constructors (12.1), no private or protected non-static data members (clause 11), no base classes (clause 10), and no virtual functions (10.3).
So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term class
refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definitions. What do these criteria imply?
This does not mean an aggregate class cannot have constructors, in fact it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user
No private or protected non-static data members. You can have as many private and protected member functions (but not constructors) as well as as many private or protected static data members and member functions as you like and not violate the rules for aggregate classes
An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor
An array is an aggregate even if it is an array of non-aggregate class type.
Now let's look at some examples:
class NotAggregate1
{
virtual void f() {} //remember? no virtual functions
};
class NotAggregate2
{
int x; //x is private by default and non-static
};
class NotAggregate3
{
public:
NotAggregate3(int) {} //oops, user-defined constructor
};
class Aggregate1
{
public:
NotAggregate1 member1; //ok, public member
Aggregate1& operator=(Aggregate1 const & rhs) {/* */} //ok, copy-assignment
private:
void f() {} // ok, just a private function
};
You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces {}
. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.
Type array_name[n] = {a1, a2, …, am};
if(m == n)
the ith element of the array is initialized with ai
else if(m < n)
the first m elements of the array are initialized with a1, a2, …, am and the other n - m
elements are, if possible, value-initialized (see below for the explanation of the term)
else if(m > n)
the compiler will issue an error
else (this is the case when n isn't specified at all like int a[] = {1, 2, 3};
)
the size of the array (n) is assumed to be equal to m, so int a[] = {1, 2, 3};
is equivalent to int a[3] = {1, 2, 3};
When an object of scalar type (bool
, int
, char
, double
, pointers, etc.) is value-initialized it means it is initialized with 0
for that type (false
for bool
, 0.0
for double
, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.
Examples of array initialization:
class A
{
public:
A(int) {} //no default constructor
};
class B
{
public:
B() {} //default constructor available
};
int main()
{
A a1[3] = {A(2), A(1), A(14)}; //OK n == m
A a2[3] = {A(2)}; //ERROR A has no default constructor. Unable to value-initialize a2[1] and a2[2]
B b1[3] = {B()}; //OK b1[1] and b1[2] are value initialized, in this case with the default-ctor
int Array1[1000] = {0}; //All elements are initialized with 0;
int Array2[1000] = {1}; //Attention: only the first element is 1, the rest are 0;
bool Array3[1000] = {}; //the braces can be empty too. All elements initialized with false
int Array4[1000]; //no initializer. This is different from an empty {} initializer in that
//the elements in this case are not value-initialized, but have indeterminate values
//(unless, of course, Array4 is a global array)
int array[2] = {1, 2, 3, 4}; //ERROR, too many initializers
}
Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.
struct X
{
int i1;
int i2;
};
struct Y
{
char c;
X x;
int i[2];
float f;
protected:
static double d;
private:
void g(){}
};
Y y = {'a', {10, 20}, {20, 30}};
In the above example y.c
is initialized with 'a'
, y.x.i1
with 10
, y.x.i2
with 20
, y.i[0]
with 20
, y.i[1]
with 30
and y.f
is value-initialized, that is, initialized with 0.0
. The protected static member d
is not initialized at all, because it is static
.
Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).
Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members therefore brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace-initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).
So enough about the aggregates. Now we can define a stricter set of types, to wit, PODs
What are PODs and why they are special
Formal definition from the C++ standard (C++03 9 §4):
A POD-struct is an aggregate class that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. Similarly, a POD-union is an aggregate union that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. A POD class is a class that is either a POD-struct or a POD-union.
Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:
An aggregate class is called a POD if it has no user-defined copy-assignment operator and destructor and none of its nonstatic members is a non-POD class, array of non-POD, or a reference.
What does this definition imply? (Did I mention POD stands for Plain Old Data?)
- All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is sure not a POD
- Classes, just like structs, can be PODs even though the standard term is POD-struct for both cases
- Just like in the case of aggregates, it doesn't matter what static members the class has
Examples:
struct POD
{
int x;
char y;
void f() {} //no harm if there's a function
static std::vector<char> v; //static members do not matter
};
struct AggregateButNotPOD1
{
int x;
~AggregateButNotPOD1() {} //user-defined destructor
};
struct AggregateButNotPOD2
{
AggregateButNotPOD1 arrOfNonPod[3]; //array of non-POD class
};
POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.
POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these two change the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
For objects of POD types it is guaranteed by the standard that when you
memcpy
the contents of your object into an array of char or unsigned char, and thenmemcpy
the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects withmemcpy
. The following example assumes T is a POD-type:#define N sizeof(T) char buf[N]; T obj; // obj initialized to its original value memcpy(buf, &obj, N); // between these two calls to memcpy, // obj might be modified memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type // holds its original value
goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type. In the following example
f()
is ill-formed whereasg()
is well-formed. Note that Microsoft's compiler is too liberal with this rule—it just issues a warning in both cases.int f() { struct NonPOD {NonPOD() {}}; goto label; NonPOD x; label: return 0; } int g() { struct POD {int i; char c;}; goto label; POD x; label: return 0; }
It is guaranteed that there will be no padding in the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely
reinterpret_cast
fromA*
toT*
and get the pointer to the first member and vice versa.
The list goes on and on…
Conclusion
It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.
What changes for C++11?
Aggregates
The standard definition of an aggregate has changed slightly, but it's still pretty much the same:
An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no brace-or-equal-initializers for non-static data members (9.2), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).
Ok, what changed?
Previously, an aggregate could have no user-declared constructors, but now it can't have user-provided constructors. Is there a difference? Yes, there is, because now you can declare constructors and default them:
struct Aggregate { Aggregate() = default; // asks the compiler to generate the default implementation };
This is still an aggregate because a constructor (or any special member function) that is defaulted on the first declaration is not user-provided.
Now an aggregate cannot have any brace-or-equal-initializers for non-static data members. What does this mean? Well, this is just because with this new standard, we can initialize members directly in the class like this:
struct NotAggregate { int x = 5; // valid in C++11 std::vector<int> s{1,2,3}; // also valid };
Using this feature makes the class no longer an aggregate because it's basically equivalent to providing your own default constructor.
So, what is an aggregate didn't change much at all. It's still the same basic idea, adapted to the new features.
What about PODs?
PODs went through a lot of changes. Lots of previous rules about PODs were relaxed in this new standard, and the way the definition is provided in the standard was radically changed.
The idea of a POD is to capture basically two distinct properties:
- It supports static initialization, and
- Compiling a POD in C++ gives you the same memory layout as a struct compiled in C.
Because of this, the definition has been split into two distinct concepts: trivial classes and standard-layout classes, because these are more useful than POD. The standard now rarely uses the term POD, preferring the more specific trivial and standard-layout concepts.
The new definition basically says that a POD is a class that is both trivial and has standard-layout, and this property must hold recursively for all non-static data members:
A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.
Let's go over each of these two properties in detail separately.
Trivial classes
Trivial is the first property mentioned above: trivial classes support static initialization.
If a class is trivially copyable (a superset of trivial classes), it is ok to copy its representation over the place with things like memcpy
and expect the result to be the same.
The standard defines a trivial class as follows:
A trivially copyable class is a class that:
— has no non-trivial copy constructors (12.8),
— has no non-trivial move constructors (12.8),
— has no non-trivial copy assignment operators (13.5.3, 12.8),
— has no non-trivial move assignment operators (13.5.3, 12.8), and
— has a trivial destructor (12.4).
A trivial class is a class that has a trivial default constructor (12.1) and is trivially copyable.
[ Note: In particular, a trivially copyable or trivial class does not have virtual functions or virtual base classes.—end note ]
So, what are all those trivial and non-trivial things?
A copy/move constructor for class X is trivial if it is not user-provided and if
— class X has no virtual functions (10.3) and no virtual base classes (10.1), and
— the constructor selected to copy/move each direct base class subobject is trivial, and
— for each non-static data member of X that is of class type (or array thereof), the constructor selected to copy/move that member is trivial;
otherwise the copy/move constructor is non-trivial.
Basically this means that a copy or move constructor is trivial if it is not user-provided, the class has nothing virtual in it, and this property holds recursively for all the members of the class and for the base class.
The definition of a trivial copy/move assignment operator is very similar, simply replacing the word "constructor" with "assignment operator".
A trivial destructor also has a similar definition, with the added constraint that it can't be virtual.
And yet another similar rule exists for trivial default constructors, with the addition that a default constructor is not-trivial if the class has non-static data members with brace-or-equal-initializers, which we've seen above.
Here are some examples to clear everything up:
// empty classes are trivial
struct Trivial1 {};
// all special members are implicit
struct Trivial2 {
int x;
};
struct Trivial3 : Trivial2 { // base class is trivial
Trivial3() = default; // not a user-provided ctor
int y;
};
struct Trivial4 {
public:
int a;
private: // no restrictions on access modifiers
int b;
};
struct Trivial5 {
Trivial1 a;
Trivial2 b;
Trivial3 c;
Trivial4 d;
};
struct Trivial6 {
Trivial2 a[23];
};
struct Trivial7 {
Trivial6 c;
void f(); // it's okay to have non-virtual functions
};
struct Trivial8 {
int x;
static NonTrivial1 y; // no restrictions on static members
};
struct Trivial9 {
Trivial9() = default; // not user-provided
// a regular constructor is okay because we still have default ctor
Trivial9(int x) : x(x) {};
int x;
};
struct NonTrivial1 : Trivial3 {
virtual void f(); // virtual members make non-trivial ctors
};
struct NonTrivial2 {
NonTrivial2() : z(42) {} // user-provided ctor
int z;
};
struct NonTrivial3 {
NonTrivial3(); // user-provided ctor
int w;
};
NonTrivial3::NonTrivial3() = default; // defaulted but not on first declaration
// still counts as user-provided
struct NonTrivial5 {
virtual ~NonTrivial5(); // virtual destructors are not trivial
};
Standard-layout
Standard-layout is the second property. The standard mentions that these are useful for communicating with other languages, and that's because a standard-layout class has the same memory layout of the equivalent C struct or union.
This is another property that must hold recursively for members and all base classes. And as usual, no virtual functions or virtual base classes are allowed. That would make the layout incompatible with C.
A relaxed rule here is that standard-layout classes must have all non-static data members with the same access control. Previously these had to be all public, but now you can make them private or protected, as long as they are all private or all protected.
When using inheritance, only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules), otherwise, it's not a standard-layout class.
This is how the definition goes in the standard text:
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.
A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
A standard-layout union is a standard-layout class defined with the class-key union.
[ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2.—end note ]
And let's see a few examples.
// empty classes have standard-layout
struct StandardLayout1 {};
struct StandardLayout2 {
int x;
};
struct StandardLayout3 {
private: // both are private, so it's ok
int x;
int y;
};
struct StandardLayout4 : StandardLayout1 {
int x;
int y;
void f(); // perfectly fine to have non-virtual functions
};
struct StandardLayout5 : StandardLayout1 {
int x;
StandardLayout1 y; // can have members of base type if they're not the first
};
struct StandardLayout6 : StandardLayout1, StandardLayout5 {
// can use multiple inheritance as long only
// one class in the hierarchy has non-static data members
};
struct StandardLayout7 {
int x;
int y;
StandardLayout7(int x, int y) : x(x), y(y) {} // user-provided ctors are ok
};
struct StandardLayout8 {
public:
StandardLayout8(int x) : x(x) {} // user-provided ctors are ok
// ok to have non-static data members and other members with different access
private:
int x;
};
struct StandardLayout9 {
int x;
static NonStandardLayout1 y; // no restrictions on static members
};
struct NonStandardLayout1 {
virtual f(); // cannot have virtual functions
};
struct NonStandardLayout2 {
NonStandardLayout1 X; // has non-standard-layout member
};
struct NonStandardLayout3 : StandardLayout1 {
StandardLayout1 x; // first member cannot be of the same type as base
};
struct NonStandardLayout4 : StandardLayout3 {
int z; // more than one class has non-static data members
};
struct NonStandardLayout5 : NonStandardLayout3 {}; // has a non-standard-layout base class
Conclusion
With these new rules a lot more types can be PODs now. And even if a type is not POD, we can take advantage of some of the POD properties separately (if it is only one of trivial or standard-layout).
The standard library has traits to test these properties in the header <type_traits>
:
template <typename T>
struct std::is_pod;
template <typename T>
struct std::is_trivial;
template <typename T>
struct std::is_trivially_copyable;
template <typename T>
struct std::is_standard_layout;
What has changed for C++14
We can refer to the Draft C++14 standard for reference.
Aggregates
This is covered in section 8.5.1
Aggregates which gives us the following definition:
An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).
The only change is now adding in-class member initializers does not make a class a non-aggregate. So the following example from C++11 aggregate initialization for classes with member in-place initializers:
struct A
{
int a = 3;
int b = 3;
};
was not an aggregate in C++11 but it is in C++14. This change is covered in N3605: Member initializers and aggregates, which has the following abstract:
Bjarne Stroustrup and Richard Smith raised an issue about aggregate initialization and member-initializers not working together. This paper proposes to fix the issue by adopting Smith's proposed wording that removes a restriction that aggregates can't have member-initializers.
POD stays the same
The definition for POD(plain old data) struct is covered in section 9
Classes which says:
A POD struct110 is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.
which is the same wording as C++11.
Standard-Layout Changes for C++14
As noted in the comments pod relies on the definition of standard-layout and that did change for C++14 but this was via defect reports that were applied to C++14 after the fact.
There were three DRs:
- DR 1672
- DR 1813
- DR 2120
So standard-layout went from this Pre C++14:
A standard-layout class is a class that:
- (7.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
- (7.2) has no virtual functions ([class.virtual]) and no virtual base classes ([class.mi]),
- (7.3) has the same access control (Clause [class.access]) for all non-static data members,
- (7.4) has no non-standard-layout base classes,
- (7.5) either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
- (7.6) has no base classes of the same type as the first non-static data member.109
To this in C++14:
A class S is a standard-layout class if it:
- (3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
- (3.2) has no virtual functions and no virtual base classes,
- (3.3) has the same access control for all non-static data members,
- (3.4) has no non-standard-layout base classes,
- (3.5) has at most one base class subobject of any given type,
- (3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
- (3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows.104 [ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X. — end note ]
- (3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
- (3.7.2) If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first non-static data member of X (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
- (3.7.3) If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
- (3.7.4) If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
- (3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
Changes in C++17
Download the C++17 International Standard final draft here.
Aggregates
C++17 expands and enhances aggregates and aggregate initialization. The standard library also now includes an std::is_aggregate
type trait class. Here is the formal definition from section 11.6.1.1 and 11.6.1.2 (internal references elided):
An aggregate is an array or a class with
— no user-provided, explicit, or inherited constructors,
— no private or protected non-static data members,
— no virtual functions, and
— no virtual, private, or protected base classes.
[ Note: Aggregate initialization does not allow accessing protected and private base class’ members or constructors. —end note ]
The elements of an aggregate are:
— for an array, the array elements in increasing subscript order, or
— for a class, the direct base classes in declaration order, followed by the direct non-static data members that are not members of an anonymous union, in declaration order.
What changed?
- Aggregates can now have public, non-virtual base classes. Furthermore, it is not a requirement that base classes be aggregates. If they are not aggregates, they are list-initialized.
struct B1 // not a aggregate
{
int i1;
B1(int a) : i1(a) { }
};
struct B2
{
int i2;
B2() = default;
};
struct M // not an aggregate
{
int m;
M(int a) : m(a) { }
};
struct C : B1, B2
{
int j;
M m;
C() = default;
};
C c { { 1 }, { 2 }, 3, { 4 } };
cout
<< "is C aggregate?: " << (std::is_aggregate<C>::value ? 'Y' : 'N')
<< " i1: " << c.i1 << " i2: " << c.i2
<< " j: " << c.j << " m.m: " << c.m.m << endl;
//stdout: is C aggregate?: Y, i1=1 i2=2 j=3 m.m=4
- Explicit defaulted constructors are disallowed
struct D // not an aggregate
{
int i = 0;
D() = default;
explicit D(D const&) = default;
};
- Inheriting constructors are disallowed
struct B1
{
int i1;
B1() : i1(0) { }
};
struct C : B1 // not an aggregate
{
using B1::B1;
};
Trivial Classes
The definition of trivial class was reworked in C++17 to address several defects that were not addressed in C++14. The changes were technical in nature. Here is the new definition at 12.0.6 (internal references elided):
A trivially copyable class is a class:
— where each copy constructor, move constructor, copy assignment operator, and move assignment operator is either deleted or trivial,
— that has at least one non-deleted copy constructor, move constructor, copy assignment operator, or move assignment operator, and
— that has a trivial, non-deleted destructor.
A trivial class is a class that is trivially copyable and has one or more default constructors, all of which are either trivial or deleted and at least one of which is not deleted. [ Note: In particular, a trivially copyable or trivial class does not have virtual functions or virtual base classes.—end note ]
Changes:
- Under C++14, for a class to be trivial, the class could not have any copy/move constructor/assignment operators that were non-trivial. However, then an implicitly declared as defaulted constructor/operator could be non-trivial and yet defined as deleted because, for example, the class contained a subobject of class type that could not be copied/moved. The presence of such non-trivial, defined-as-deleted constructor/operator would cause the whole class to be non-trivial. A similar problem existed with destructors. C++17 clarifies that the presence of such constructor/operators does not cause the class to be non-trivially copyable, hence non-trivial, and that a trivially copyable class must have a trivial, non-deleted destructor. DR1734, DR1928
- C++14 allowed a trivially copyable class, hence a trivial class, to have every copy/move constructor/assignment operator declared as deleted. If such as class was also standard layout, it could, however, be legally copied/moved with
std::memcpy
. This was a semantic contradiction, because, by defining as deleted all constructor/assignment operators, the creator of the class clearly intended that the class could not be copied/moved, yet the class still met the definition of a trivially copyable class. Hence in C++17 we have a new clause stating that trivially copyable class must have at least one trivial, non-deleted (though not necessarily publicly accessible) copy/move constructor/assignment operator. See N4148, DR1734 - The third technical change concerns a similar problem with default constructors. Under C++14, a class could have trivial default constructors that were implicitly defined as deleted, yet still be a trivial class. The new definition clarifies that a trivial class must have a least one trivial, non-deleted default constructor. See DR1496
Standard-layout Classes
The definition of standard-layout was also reworked to address defect reports. Again the changes were technical in nature. Here is the text from the standard (12.0.7). As before, internal references are elided:
A class S is a standard-layout class if it:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions and no virtual base classes,
— has the same access control for all non-static data members,
— has no non-standard-layout base classes,
— has at most one base class subobject of any given type,
— has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
— has no element of the set M(S) of types (defined below) as a base class.108
M(X) is defined as follows:
— If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
— If X is a non-union class type whose first non-static data member has type X0 (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
— If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
— If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
— If X is a non-class, non-array type, the set M(X) is empty.
[ Note: M(X) is the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset in X. —end note ]
[ Example:
—end example ]struct B { int i; }; // standard-layout class struct C : B { }; // standard-layout class struct D : C { }; // standard-layout class struct E : D { char : 4; }; // not a standard-layout class struct Q {}; struct S : Q { }; struct T : Q { }; struct U : S, T { }; // not a standard-layout class
108) This ensures that two subobjects that have the same class type and that belong to the same most derived object are not allocated at the same address.
Changes:
- Clarified that the requirement that only one class in the derivation tree "has" non-static data members refers to a class where such data members are first declared, not classes where they may be inherited, and extended this requirement to non-static bit fields. Also clarified that a standard-layout class "has at most one base class subobject of any given type." See DR1813, DR1881
- The definition of standard-layout has never allowed the type of any base class to be the same type as the first non-static data member. It is to avoid a situation where a data member at offset zero has the same type as any base class. The C++17 standard provides a more rigorous, recursive definition of "the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset" so as to prohibit such types from being the type of any base class. See DR1672, DR2120.
Note: The C++ standards committee intended the above changes based on defect reports to apply to C++14, though the new language is not in the published C++14 standard. It is in the C++17 standard.
POD in C++11 was basically split into two different axes here: triviality and layout. Triviality is about the relationship between an object's conceptual value and the bits of data within its storage. Layout is about... well, the layout of an object's subobjects. Only class types have layout, while all types have triviality relationships.
So here is what the triviality axis is about:
Non-trivially copyable: The value of objects of such types may be more than just the binary data that are stored directly within the object.
For example,
unique_ptr<T>
stores aT*
; that is the totality of the binary data within the object. But that's not the totality of the value of aunique_ptr<T>
. Aunique_ptr<T>
stores either anullptr
or a pointer to an object whose lifetime is managed by theunique_ptr<T>
instance. That management is part of the value of aunique_ptr<T>
. And that value is not part of the binary data of the object; it is created by the various member functions of that object.For example, to assign
nullptr
to aunique_ptr<T>
is to do more than just change the bits stored in the object. Such an assignment must destroy any object managed by theunique_ptr
. To manipulate the internal storage of aunique_ptr
without going through its member functions would damage this mechanism, to change its internalT*
without destroying the object it currently manages, would violate the conceptual value that the object possesses.Trivially copyable: The value of such objects are exactly and only the contents of their binary storage. This is what makes it reasonable to allow copying that binary storage to be equivalent to copying the object itself.
The specific rules that define trivial copyability (trivial destructor, trivial/deleted copy/move constructors/assignment) are what is required for a type to be binary-value-only. An object's destructor can participate in defining the "value" of an object, as in the case with
unique_ptr
. If that destructor is trivial, then it doesn't participate in defining the object's value.Specialized copy/move operations also can participate in an object's value.
unique_ptr
's move constructor modifies the source of the move operation by null-ing it out. This is what ensures that the value of aunique_ptr
is unique. Trivial copy/move operations mean that such object value shenanigans are not being played, so the object's value can only be the binary data it stores.Trivial: This object is considered to have a functional value for any bits that it stores. Trivially copyable defines the meaning of the data store of an object as being just that data. But such types can still control how data gets there (to some extent). Such a type can have default member initializers and/or a default constructor that ensures that a particular member always has a particular value. And thus, the conceptual value of the object can be restricted to a subset of the binary data that it could store.
Performing default initialization on a type that has a trivial default constructor will leave that object with completely uninitialized values. As such, a type with a trivial default constructor is logically valid with any binary data in its data storage.
The layout axis is really quite simple. Compilers are given a lot of leeway in deciding how the subobjects of a class are stored within the class's storage. However, there are some cases where this leeway is not necessary, and having more rigid ordering guarantees is useful.
Such types are standard layout types. And the C++ standard doesn't even really do much with saying what that layout is specifically. It basically says three things about standard layout types:
The first subobject is at the same address as the object itself.
You can use
offsetof
to get a byte offset from the outer object to one of its member subobjects.union
s get to play some games with accessing subobjects through an inactive member of a union if the active member is (at least partially) using the same layout as the inactive one being accessed.
Compilers generally permit standard layout objects to map to struct
types with the same members in C. But there is no statement of that in the C++ standard; that's just what compilers feel like doing.
POD is basically a useless term at this point. It is just the intersection of trivial copyability (the value is only its binary data) and standard layout (the order of its subobjects is more well-defined). One can infer from such things that the type is C-like and could map to similar C objects. But the standard has no statements to that effect.
can you please elaborate following rules:
I'll try:
a) standard-layout classes must have all non-static data members with the same access control
That's simple: all non-static data members must all be public
, private
, or protected
. You can't have some public
and some private
.
The reasoning for them goes to the reasoning for having a distinction between "standard layout" and "not standard layout" at all. Namely, to give the compiler the freedom to choose how to put things into memory. It's not just about vtable pointers.
Back when they standardized C++ in 98, they had to basically predict how people would implement it. While they had quite a bit of implementation experience with various flavors of C++, they weren't certain about things. So they decided to be cautious: give the compilers as much freedom as possible.
That's why the definition of POD in C++98 is so strict. It gave C++ compilers great latitude on member layout for most classes. Basically, POD types were intended to be special cases, something you specifically wrote for a reason.
When C++11 was being worked on, they had a lot more experience with compilers. And they realized that... C++ compiler writers are really lazy. They had all this freedom, but they didn't do anything with it.
The rules of standard layout are more or less codifying common practice: most compilers didn't really have to change much if anything at all to implement them (outside of maybe some stuff for the corresponding type traits).
Now, when it came to public
/private
, things are different. The freedom to reorder which members are public
vs. private
actually can matter to the compiler, particularly in debugging builds. And since the point of standard layout is that there is compatibility with other languages, you can't have the layout be different in debug vs. release.
Then there's the fact that it doesn't really hurt the user. If you're making an encapsulated class, odds are good that all of your data members will be private
anyway. You generally don't expose public data members on fully encapsulated types. So this would only be a problem for those few users who do want to do that, who want that division.
So it's no big loss.
b) only one class in the whole inheritance tree can have non-static data members,
The reason for this one comes back to why they standardized standard layout again: common practice.
There's no common practice when it comes to having two members of an inheritance tree that actually store things. Some put the base class before the derived, others do it the other way. Which way do you order the members if they come from two base classes? And so on. Compilers diverge greatly on these questions.
Also, thanks to the zero/one/infinity rule, once you say you can have two classes with members, you can say as many as you want. This requires adding a lot of layout rules for how to handle this. You have to say how multiple inheritance works, which classes put their data before other classes, etc. That's a lot of rules, for very little material gain.
You can't make everything that doesn't have virtual functions and a default constructor standard layout.
and the first non-static data member cannot be of a base class type (this could break aliasing rules).
I can't really speak to this one. I'm not educated enough in C++'s aliasing rules to really understand it. But it has something to do with the fact that the base member will share the same address as the base class itself. That is:
struct Base {};
struct Derived : Base { Base b; };
Derived d;
static_cast<Base*>(&d) == &d.b;
And that's probably against C++'s aliasing rules. In some way.
However, consider this: how useful could having the ability to do this ever actually be? Since only one class can have non-static data members, then Derived
must be that class (since it has a Base
as a member). So Base
must be empty (of data). And if Base
is empty, as well as a base class... why have a data member of it at all?
Since Base
is empty, it has no state. So any non-static member functions will do what they do based on their parameters, not their this
pointer.
So again: no big loss.
What changes in c++20
Following the rest of the clear theme of this question, the meaning and use of aggregates continues to change with every standard. There are several key changes on the horizon.
Types with user-declared constructors P1008
In C++17, this type is still an aggregate:
struct X {
X() = delete;
};
And hence, X{}
still compiles because that is aggregate initialization - not a constructor invocation. See also: When is a private constructor not a private constructor?
In C++20, the restriction will change from requiring:
no user-provided,
explicit
, or inherited constructors
to
no user-declared or inherited constructors
This has been adopted into the C++20 working draft. Neither the X
here nor the C
in the linked question will be aggregates in C++20.
This also makes for a yo-yo effect with the following example:
class A { protected: A() { }; };
struct B : A { B() = default; };
auto x = B{};
In C++11/14, B
was not an aggregate due to the base class, so B{}
performs value-initialization which calls B::B()
which calls A::A()
, at a point where it is accessible. This was well-formed.
In C++17, B
became an aggregate because base classes were allowed, which made B{}
aggregate-initialization. This requires copy-list-initializing an A
from {}
, but from outside the context of B
, where it is not accessible. In C++17, this is ill-formed (auto x = B();
would be fine though).
In C++20 now, because of the above rule change, B
once again ceases to be an aggregate (not because of the base class, but because of the user-declared default constructor - even though it's defaulted). So we're back to going through B
's constructor, and this snippet becomes well-formed.
Initializing aggregates from a parenthesized list of values P960
A common issue that comes up is wanting to use emplace()
-style constructors with aggregates:
struct X { int a, b; };
std::vector<X> xs;
xs.emplace_back(1, 2); // error
This does not work, because emplace
will try to effectively perform the initialization X(1, 2)
, which is not valid. The typical solution is to add a constructor to X
, but with this proposal (currently working its way through Core), aggregates will effectively have synthesized constructors which do the right thing - and behave like regular constructors. The above code will compile as-is in C++20.
Class Template Argument Deduction (CTAD) for Aggregates P1021 (specifically P1816)
In C++17, this does not compile:
template <typename T>
struct Point {
T x, y;
};
Point p{1, 2}; // error
Users would have to write their own deduction guide for all aggregate templates:
template <typename T> Point(T, T) -> Point<T>;
But as this is in some sense "the obvious thing" to do, and is basically just boilerplate, the language will do this for you. This example will compile in C++20 (without the need for the user-provided deduction guide).
精彩评论