Memory alignment in C-structs
I'm working on a 32-bit machine, so I suppose that the memory alignment should be 4 bytes. Say I have this struct:
typedef struct {
unsigned short v1;开发者_开发问答
unsigned short v2;
unsigned short v3;
} myStruct;
The plain added size is 6 bytes, and I suppose that the aligned size should be 8, but sizeof(myStruct)
returns me 6.
However if I write:
typedef struct {
unsigned short v1;
unsigned short v2;
unsigned short v3;
int i;
} myStruct;
the plain added size is 10 bytes, aligned size shall be 12, and this time sizeof(myStruct) == 12
.
Can somebody explain what is the difference?
At least on most machines, a type is only ever aligned to a boundary as large as the type itself [Edit: you can't really demand any "more" alignment than that, because you have to be able to create arrays, and you can't insert padding into an array]. On your implementation, short
is apparently 2 bytes, and int
4 bytes.
That means your first struct is aligned to a 2-byte boundary. Since all the members are 2 bytes apiece, no padding is inserted between them.
The second contains a 4-byte item, which gets aligned to a 4-byte boundary. Since it's preceded by 6 bytes, 2 bytes of padding is inserted between v3
and i
, giving 6 bytes of data in the short
s, two bytes of padding, and 4 more bytes of data in the int
for a total of 12.
Forget about having different members, even if you write two structs whose members are exactly same, with a difference is that the order in which they're declared is different, then size of each struct can be (and often is) different.
For example, see this,
#include <iostream>
using namespace std;
struct A
{
char c;
char d;
int i;
};
struct B
{
char c;
int i; //note the order is different!
char d;
};
int main() {
cout << sizeof(A) << endl;
cout << sizeof(B) << endl;
}
Compile it with gcc-4.3.4
, and you get this output:
8
12
That is, sizes are different even though both structs has same members!
Code at Ideone : http://ideone.com/HGGVl
The bottomline is that the Standard doesn't talk about how padding should be done, and so the compilers are free to make any decision and you cannot assume all compilers make the same decision.
By default, values are aligned according to their size. So a 2-byte value like a short
is aligned on a 2-byte boundary, and a 4-byte value like an int
is aligned on a 4-byte boundary
In your example, 2 bytes of padding are added before i
to ensure that i
falls on a 4-byte boundary.
(The entire structure is aligned on a boundary at least as big as the biggest value in the structure, so your structure will be aligned to a 4-byte boundary.)
The actual rules vary according to the platform - the Wikipedia page on Data structure alignment has more details.
Compilers typically let you control the packing via (for example) #pragma pack
directives.
Assuming:
sizeof(unsigned short) == 2
sizeof(int) == 4
Then I personally would use the following (your compiler may differ):
unsigned shorts are aligned to 2 byte boundaries
int will be aligned to 4 byte boundaries.
typedef struct
{
unsigned short v1; // 0 bytes offset
unsigned short v2; // 2 bytes offset
unsigned short v3; // 4 bytes offset
} myStruct; // End 6 bytes.
// No part is required to align tighter than 2 bytes.
// So whole structure can be 2 byte aligned.
typedef struct
{
unsigned short v1; // 0 bytes offset
unsigned short v2; // 2 bytes offset
unsigned short v3; // 4 bytes offset
/// Padding // 6-7 padding (so i is 4 byte aligned)
int i; // 8 bytes offset
} myStruct; // End 12 bytes
// Whole structure needs to be 4 byte aligned.
// So that i is correctly aligned.
Firstly, while the specifics of padding are left up to the compiler, the OS also imposes some rules as to alignment requirements. This answer assumes that you are using gcc, though the OS may vary
To determine the space occupied by a given struct and its elements, you can follow these rules:
First, assume that the struct always starts at an address that is properly aligned for all data types.
Then for every entry in the struct:
- The minimum space needed is the raw size of the element given by
sizeof(element)
. - The alignment requirement of the element is the alignment requirement of the element's base type.
Notably, this means that the alignment requirement for a
char[20]
array is the same as the requirement for a plainchar
.
Finally, the alignment requirement of the struct as a whole is the maximum of the alignment requirements of each of its elements.
gcc will insert padding after a given element to ensure that the next one (or the struct if we are talking about the last element) is correctly aligned. It will never rearrange the order of the elements in the struct, even if that will save memory.
Now the alignment requirements themselves are also a bit odd.
- 32-bit Linux requires that 2-byte data types have 2-byte alignment (their addresses must be even). All larger data types must have 4-byte alignment (addresses ending in
0x0
,0x4
,0x8
or0xC
). Note that this applies to types larger than 4 bytes as well (such asdouble
andlong double
). - 32-bit Windows is more strict in that if a type is K bytes in size, it must be K byte aligned. This means that a
double
can only placed at an address ending in0x0
or0x8
. The only exception to this is thelong double
which is still 4-byte aligned even though it is actually 12-bytes long. - For both Linux and Windows, on 64-bit machines, a K byte type must be K byte aligned. Again, the
long double
is an exception and must be 16-byte aligned.
Each data type needs to be aligned on a memory boundary of its own size. So a short
needs to be on aligned on a 2-byte boundary, and an int
needs to be on a 4-byte boundary. Similarly, a long long
would need to be on an 8-byte boundary.
The reason for the second sizeof(myStruct)
being 12
is the padding that gets inserted between v3
and i
to align i
at a 32-bit boundary. There is two bytes of it.
Wikipedia explains the padding and alignment reasonably clearly.
In your first struct, since every item is of size short
, the whole struct can be aligned on short
boundaries, so it doesn't need to add any padding at the end.
In the second struct, the int (presumably 32 bits) needs to be word aligned so it inserts padding between v3
and i
to align i
.
Sounds like its being aligned to bounderies based on the size of each var, so that the address is a multiple of the size being accessed(so shorts are aligned to 2, ints aligned to 4 etc), if you moved one of the shorts after the int, sizeof(mystruct)
should be 10. Of course this all depends on the compiler being used and what settings its using in turn.
The standard doesn't say much about the layout of structs with complete types - it's up to to the compiler. It decided that it needs the int to start on a boundary to access it, but since it has to do sub-boundary memory addressing for the shorts there is no need to pad them
精彩评论