
Don't understand the bit field description from the C standard

"Whether a field may overlap a word boundary is implementation-defined. Fields need not be named; unnamed fields (a colon and width only) are used for padding. The special width 0 may be used to force alignment at the next word boundary."

- The C programming Language [2e] by Kernighan & Ritchie [Section 6.9, p.150]

I'm unable to get these lines. Can you please explain?


"whether a field may overlap a word boundary is implementation-defined.

Consider two words of memory, where the word size is say 32 bits:

[31] [30] [29] ... [2] [1] [0] | [31] [30] [29] ... [2] [1] [0]

If we had a struct:

struct X
{
    int a : 30;
    int b : 4;
};

Then a compiler might choose to put field b so part is in each word, or it might leave a gap so that all of b falls inside the second word:

[31] [30] [29] ... [2] [1] [0] | [31] [30] [29] [28] ... [2] [1] [0]
a--------------------a b-----------------b
OR
a--------------------a    GAP    b-----------------b

Why might it leave a GAP? Because then, when it wants to read or write b, it only needs to work with one word in memory. That's typically faster and simpler, needing fewer CPU instructions.
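One way to see which choice a particular compiler made is to probe the struct directly. What follows is only a rough sketch under a few assumptions: the fields are declared unsigned int (so signedness does not get in the way), the helper names x_size_in_bits and x_b_in_later_unit are made up for this example, and it relies on the bytes cleared by memset staying zero after the field is written, which the standard does not strictly promise.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Same shape as struct X above, but with unsigned fields (an assumption
   made here so that the sign of the fields does not matter). */
struct X {
    unsigned int a : 30;
    unsigned int b : 4;
};

/* Total storage the compiler reserved for struct X, in bits. */
size_t x_size_in_bits(void)
{
    return sizeof(struct X) * CHAR_BIT;
}

/* Returns 1 if no bit of b lands in the first sizeof(unsigned int) bytes,
   i.e. the compiler left a gap and started b in a later unit.
   Returns 0 if at least part of b shares the first unit with a. */
int x_b_in_later_unit(void)
{
    struct X x;
    unsigned char bytes[sizeof(struct X)];
    size_t i;

    memset(&x, 0, sizeof x);
    x.b = 0xF;                      /* set every bit of b, leave a at 0 */
    memcpy(bytes, &x, sizeof x);

    for (i = 0; i < sizeof(unsigned int) && i < sizeof x; i++) {
        if (bytes[i] != 0)
            return 0;               /* some bit of b is in the first unit */
    }
    return 1;
}
```

Call both helpers from main and print the results. With a 32-bit int I would expect many compilers to report 64 bits and a gap, but that is exactly the part the book says is implementation-defined, so treat whatever you see as a fact about your compiler only.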

fields need not be named; unnamed fields (a colon and width only) are used for padding.

If we changed our earlier struct, we could explicitly ask for a gap:

struct X
{
    int a : 30;
    int   : 2;  // unnamed field
    int b : 4;
};

This says "leave 2 bits between a and b; they don't need an identifier (name) because I'll never ask what's in them, or ask for their value to be changed". You don't have to make it 2 just so that 30 + 2 == 32 (our word size), though: you can ask for whatever gaps you like wherever you like. This is useful when you're dealing with values from some hardware device and you know what some of the bits mean but not others, or you simply don't need some of them. You can leave those bits unnamed to document your disinterest while still having the compiler place the named bit fields at the offsets within the word that the hardware uses.
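As a small illustration of the hardware case, here is a sketch of a made-up 8-bit status register. The register layout, the struct name StatusReg and the helper make_status are all hypothetical. Keep in mind that the order in which a compiler allocates bits within a unit is itself implementation-defined, so matching a real device means checking your compiler's documentation.

```c
/* Hypothetical status register, invented for illustration:
   one ready flag, three bits we do not care about, a 4-bit error code. */
struct StatusReg {
    unsigned int ready      : 1;
    unsigned int            : 3;   /* unnamed: reserved bits, never accessed */
    unsigned int error_code : 4;
};

/* Builds a register value. Unnamed fields take no part in initialization,
   so the two initializers below go to ready and error_code respectively. */
struct StatusReg make_status(unsigned int ready, unsigned int error_code)
{
    struct StatusReg r = { ready, error_code };
    return r;
}
```

For example, make_status(1, 9) gives a value whose ready field reads back as 1 and whose error_code reads back as 9. The three unnamed bits cannot be read or written by name at all.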

the special width 0 may be used to force alignment at the next word boundary."

This just means the compiler can calculate how many bits are left in the partially-filled word, and skip to the start of the next word. Just as we ensured b started in a new word by adding a 2 bit field above (given our knowledge that a was 30 bits and the word size was 32), we could have...

struct X
{
    int a : 30;
    int   : 0;  // unnamed field
    int b : 4;
};

...and the compiler would work out the 2 for us. This way, if we change a to be some other size, or end up compiling for a 64-bit word size, the compiler will silently adjust to appropriate behaviour without needing the unnamed field to be manually corrected.
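The effect of the zero-width field can be observed by comparing two otherwise identical structs. This is a sketch only: the struct names Packed and Split and the two size helpers are invented for the example, and the fields are unsigned int by assumption.

```c
#include <stddef.h>

struct Packed {
    unsigned int a : 4;
    unsigned int b : 4;   /* normally shares a storage unit with a */
};

struct Split {
    unsigned int a : 4;
    unsigned int   : 0;   /* close the current unit */
    unsigned int b : 4;   /* starts in a fresh unit */
};

size_t packed_size(void) { return sizeof(struct Packed); }
size_t split_size(void)  { return sizeof(struct Split); }
```

On a typical desktop compiler I would expect packed_size() to be 4 and split_size() to be 8, since b is pushed into a second unit, but the exact numbers are up to the implementation.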


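Basically, if an address is "word aligned", the processor can generally perform operations on it faster. A word is the processor's natural unit of data: 32 bits (4 bytes) on a 32-bit machine, which is the size assumed in the rest of this answer.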

Typical processors access memory in word-aligned units, meaning that they can retrieve an entire aligned "word" of memory in one operation. When a value is spread across multiple words, the processor must perform multiple operations to get the same data. Sometimes this is unavoidable, for instance if you are using a "double word", but if a single word-sized value stretches across a word boundary, the CPU has to perform 2 operations to retrieve that one word of data.

Examples of word-aligned addresses are 0x10000004 and 0x10000008. Since a word is 4 bytes, the address must be divisible by 4. A non-word-aligned address is 0x10000003.

To the programmer, all operations will work as expected (at least on processors that permit unaligned access), but under the hood the CPU needs only 1 memory operation to read or write a word at 0x10000004, whereas it needs 2 memory operations to read or write a word at 0x10000003, since that word crosses a word boundary.
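The arithmetic behind those two cases can be written down in a few lines. This is a simplified model, assuming memory is fetched in aligned chunks of word_size bytes (real processors with caches are more complicated). The function names is_word_aligned and words_touched are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* 1 if addr is a multiple of word_size, else 0. */
int is_word_aligned(uintptr_t addr, size_t word_size)
{
    return addr % word_size == 0;
}

/* Number of aligned, word-sized chunks of memory overlapped by an access
   of `size` bytes starting at addr (size must be at least 1). */
size_t words_touched(uintptr_t addr, size_t size, size_t word_size)
{
    uintptr_t first = addr / word_size;
    uintptr_t last  = (addr + size - 1) / word_size;
    return (size_t)(last - first + 1);
}
```

With a 4-byte word, words_touched(0x10000004, 4, 4) is 1 and words_touched(0x10000003, 4, 4) is 2, which matches the one-operation and two-operation cases described above.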

In reference to your original question, this is basically saying that, depending on the compiler you use, a bit field may or may not be allowed to straddle a word boundary. This is an example of size vs. speed: you can pack more data if fields are allowed to cross word boundaries, but as shown above, access will be slower.


First, it mostly has to do with memory alignment. Compilers often align variables or fields on word boundaries; a word is 32 bits on a 32-bit platform. This means that two bools may each end up in the first byte of a different word, rather than in two consecutive bytes.

Bit fields can force a layout in memory: you can be sure a specific field only uses 3 bits if its values range from 0-7.
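To make the 3-bit case concrete, here is a small sketch. The struct name Small and the helper store_level are invented for the example, and the field is declared unsigned int so that out-of-range values wrap in a well-defined way.

```c
struct Small {
    unsigned int level : 3;   /* can hold 0..7 */
};

/* Stores value in the 3-bit field and returns what the field then holds. */
unsigned int store_level(unsigned int value)
{
    struct Small s;
    s.level = value;          /* reduced modulo 8 on the way in */
    return s.level;
}
```

store_level(7) gives back 7, while store_level(8) gives back 0 and store_level(13) gives back 5, because only the low three bits fit in the field.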

A field can be unnamed. You don't need to name a field if you're not going to use it. This may be used to force a specific layout.

If you use :0, it will auto align on the next word boundary.

In general, you don't need this behavior unless you're tuning performance in some way.
