Specifying size of enum type in C
Already read through this related question, but was looking for something a little more specific.
- Is there a way to tell your compiler specifically how wide you want your enum to be?
- If so, how do you do it? I know how to specify it in C#; is it similarly done in C?
- Would it even be worth doing? When the enum value is passed to a function, will it be passed as an int-sized value regardless?
I believe there is a flag if you are using GCC.
-fshort-enums
Is there a way to tell your compiler specifically how wide you want your enum to be?
In the general case, no; not in standard C before C23.
Would it even be worth doing?
It depends on the context. If you are talking about passing parameters to functions, then no, it is not worth doing (see below). If it is about saving memory when building aggregates from enum types, then it might be worth doing. However, in C you can simply use a suitably-sized integer type instead of enum type in aggregates. In C (as opposed to C++) enum types and integer types are almost always interchangeable.
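The aggregate case can be sketched like this. The names enum color, struct pixel_wide, struct pixel_compact and make_pixel are made up for illustration, and the size of the enum-typed member is implementation-defined, so only the compact layout is checked exactly (and that check assumes a mainstream ABI with no padding between one-byte members).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint8_t */

/* Hypothetical enum used only for this illustration. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* Member has the enum type: its size is chosen by the compiler (often 4 bytes). */
struct pixel_wide {
    enum color color;
    uint8_t    alpha;
};

/* Member is a fixed-width integer that merely holds enum color values. */
struct pixel_compact {
    uint8_t color;   /* stores an enum color constant */
    uint8_t alpha;
};

static_assert(sizeof(struct pixel_compact) == 2,
              "two one-byte members, no padding expected");
static_assert(sizeof(struct pixel_compact) <= sizeof(struct pixel_wide),
              "the compact layout should never be larger");

/* Enum constants convert implicitly to and from the integer member. */
static struct pixel_compact make_pixel(enum color c, uint8_t alpha)
{
    struct pixel_compact p = { (uint8_t)c, alpha };
    return p;
}
```

The checks run at compile time, so compiling the file (for example with gcc -std=c11 -c) is enough to exercise them. The trade-off is that the member no longer documents which enum it holds, hence the comment on the field.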
When the enum value is passed to a function, will it be passed as an int-sized value regardless?
Many (most) compilers these days pass all parameters as values of natural word size for the given hardware platform. For example, on a 64-bit platform many compilers will pass all parameters as 64-bit values, regardless of their actual size, even if type int has 32 bits in it on that platform (so, it is not generally passed as an "int-sized" value on such a platform). For this reason, it makes no sense to try to optimize enum sizes for parameter passing purposes.
You can force it to be at least a certain size by defining an appropriate value. For example, if you want your enum to be stored as the same size as an int, even though all the values would fit in a char, you can do something like this:
typedef enum {
firstValue = 1,
secondValue = 2,
Internal_ForceMyEnumIntSize = INT_MAX /* from <limits.h> */
} MyEnum;
Note, however, that the behavior can be dependent on the implementation.
As you note, passing such a value to a function will cause it to be expanded to an int anyway, but if you are using your type in an array or a struct, then the size will matter. If you really care about element sizes, you should really use types like int8_t, int32_t, etc.
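Since the behavior is implementation-dependent, it may be worth letting the compiler confirm the assumption. A minimal sketch, reusing the MyEnum example with INT_MAX from <limits.h> and a C11 static_assert:

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* INT_MAX */

typedef enum {
    firstValue  = 1,
    secondValue = 2,
    Internal_ForceMyEnumIntSize = INT_MAX  /* sentinel, never used as a real value */
} MyEnum;

/* The enum type must be able to represent INT_MAX, so it should not be
   narrower than int; the build fails here if that assumption is wrong. */
static_assert(sizeof(MyEnum) >= sizeof(int), "MyEnum is narrower than int");
```

This only guarantees a lower bound. It does not stop a compiler from choosing a wider type.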
Even if you are writing strict C code, the results are going to be compiler dependent. Employing the strategies from this thread, I got some interesting results...
enum_size.c
#include <stdio.h>
enum __attribute__((__packed__)) PackedFlags {
PACKED = 0b00000001,
};
enum UnpackedFlags {
UNPACKED = 0b00000001,
};
int main (int argc, char * argv[]) {
printf("packed:\t\t%zu\n", sizeof(PACKED));
printf("unpacked:\t%zu\n", sizeof(UNPACKED));
return 0;
}
$ gcc enum_size.c
$ ./a.out
packed: 4
unpacked: 4
$ gcc enum_size.c -fshort-enums
$ ./a.out
packed: 4
unpacked: 4
$ g++ enum_size.c
$ ./a.out
packed: 1
unpacked: 4
$ g++ enum_size.c -fshort-enums
$ ./a.out
packed: 1
unpacked: 1
In my example above, I did not realize any benefit from the __attribute__((__packed__)) modifier until I started using the C++ compiler.
EDIT:
@technosaurus's suspicion was correct.
By checking sizeof(enum PackedFlags) instead of sizeof(PACKED), I see the results I had expected...
printf("packed:\t\t%zu\n", sizeof(enum PackedFlags));
printf("unpacked:\t%zu\n", sizeof(enum UnpackedFlags));
I now see the expected results from gcc:
$ gcc enum_size.c
$ ./a.out
packed: 1
unpacked: 4
$ gcc enum_size.c -fshort-enums
$ ./a.out
packed: 1
unpacked: 1
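The reason for the difference is that, in C, an enumeration constant such as PACKED has type int, while the packed attribute applies to the enum type itself. A small sketch of that distinction, assuming GCC or Clang (the __attribute__((__packed__)) syntax is a compiler extension) and assuming the file is compiled as C rather than C++:

```c
#include <assert.h>   /* static_assert (C11) */

enum __attribute__((__packed__)) PackedFlags {
    PACKED = 1
};

enum UnpackedFlags {
    UNPACKED = 1
};

/* In C, an enumeration constant is an int, whatever the size of its enum type. */
static_assert(sizeof(PACKED) == sizeof(int), "enum constants have type int in C");
static_assert(sizeof(UNPACKED) == sizeof(int), "enum constants have type int in C");

/* The packed attribute shrinks the type itself (GCC/Clang behavior). */
static_assert(sizeof(enum PackedFlags) == 1, "packed enum should fit in one byte");
```

In C++ the enumerators have the enum's own type, which is why g++ reported 1 for sizeof(PACKED) in the first run.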
There is also another way if the enum is part of a structure:
#include <limits.h> /* CHAR_BIT */
enum whatever { a,b,c,d };
struct something {
char :0;
enum whatever field:CHAR_BIT;
char :0;
};
The :0; can be omitted if the enum field is surrounded by normal fields. If there's another bit-field before it, the :0 will force alignment to the next byte for the field following it.
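To see what the bit-field buys you, the two layouts can be compared at compile time. This is only a sketch: the struct names are made up, bit-fields of enum type are an implementation-defined extension of what the standard guarantees, and the total struct size may still be rounded up to the alignment of the enum type, so the check is a comparison rather than an exact number.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

enum whatever { a, b, c, d };

/* Plain enum member: occupies sizeof(enum whatever) bytes plus any padding. */
struct with_plain_enum {
    char          before;
    enum whatever field;
    char          after;
};

/* Bit-field member: only CHAR_BIT bits are reserved for the enum value. */
struct with_bitfield {
    char          before;
    enum whatever field : CHAR_BIT;
    char          after;
};

/* Holds on mainstream compilers; the standard does not promise it. */
static_assert(sizeof(struct with_bitfield) <= sizeof(struct with_plain_enum),
              "bit-field layout should not be larger than the plain layout");
```

Keep in mind that you cannot take the address of a bit-field, so this only suits fields that are read and written directly.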
In some circumstances, this may be helpful:
#include <stdint.h>
typedef uint8_t command_t;
enum command_enum
{
CMD_IDENT = 0x00, //!< Identify command
CMD_SCENE_0 = 0x10, //!< Recall Scene 0 command
CMD_SCENE_1 = 0x11, //!< Recall Scene 1 command
CMD_SCENE_2 = 0x12, //!< Recall Scene 2 command
};
/* cmdVariable is 8 bits wide */
command_t cmdVariable = CMD_IDENT;
On the one hand, type command_t has size 1 (8 bits) and can be used as a variable and function parameter type. On the other hand, you can assign the enum values, which have type int by default; the compiler converts them implicitly when they are assigned to a command_t variable.
Also, if you do something unsafe like defining and using a CMD_16bit = 0xFFFF, the compiler will warn you with the following message:
warning: large integer implicitly truncated to unsigned type [-Woverflow]
As @Nyx0uf says, GCC has a flag which you can set:
-fshort-enums
Allocate to an enum type only as many bytes as it needs for the declared range of possible values. Specifically, the enum type is equivalent to the smallest integer type that has enough room.
Warning: the -fshort-enums switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface.
Source: https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
Additional great reading for general insight: https://www.embedded.fm/blog/2016/6/28/how-big-is-an-enum.
Interesting... Adding an enum entry called ARM_EXCEPTION_MAKE_ENUM_32_BIT with a value equal to 0xffffffff, which is the equivalent of UINT32_MAX from stdint.h, forces this particular Arm_symbolic_exception_name enum to have an integer type of uint32_t. That is the sole purpose of this ARM_EXCEPTION_MAKE_ENUM_32_BIT entry! It works because uint32_t is the smallest integer type which can contain all of the enum values in this enum, namely 0 through 8, inclusive, as well as 0xffffffff, or decimal 2^32-1 = 4294967295.
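Here is a reduced sketch of the same trick. The enum and enumerator names below are hypothetical stand-ins, not the real ARM definitions, and the example assumes GCC or Clang: before C23, ISO C requires enumerator values to be representable as int, so 0xffffffff is accepted only as an extension there.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t, UINT32_MAX */

/* Hypothetical stand-in for an enum with values 0 through 8 plus a sentinel. */
enum exception_name {
    EXC_FIRST = 0,
    EXC_LAST  = 8,
    EXC_MAKE_ENUM_32_BIT = 0xffffffff  /* sentinel: forces a 32-bit type */
};

/* The sentinel needs 32 value bits, so the enum cannot shrink below 4 bytes. */
static_assert(sizeof(enum exception_name) == sizeof(uint32_t),
              "enum should be exactly 32 bits wide");
```

The sentinel keeps the enum at 32 bits even when an option such as -fshort-enums would otherwise shrink it.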
Right now I can't answer your first two questions, because I am trying to find a good way to do this myself. Maybe I will edit this if I find a strategy that I like. It isn't intuitive though.
But I want to point something out that hasn't been mentioned so far, and to do so I will answer the third question like so:
It is "worth doing" when writing a C API that will be called from languages that aren't C. Anything that directly links to the C code will need to correctly understand the memory layout of all structs, parameter lists, etc. in the C code's API. Unfortunately, C types like int, or worse yet, enums, are fairly unpredictably sized (the size changes by compiler, platform, etc.), so knowing the memory layout of anything containing an enum can be dodgy unless your other programming language's compiler is also the C compiler AND it has some in-language mechanism to exploit that knowledge. It is much easier to write problem-free bindings to C libraries when the API uses predictably-sized C types like uint8_t, uint16_t, uint32_t, uint64_t, void*, uintptr_t, etc., and structs/unions composed of those predictably-sized types.
So I would care about enum sizing when it matters for program correctness, such as when memory layout and alignment issues are possible. But I wouldn't worry about it so much for optimization, not unless you have some niche situation that amplifies the opportunity cost (ex: a large array/list of enum-typed values on a memory constrained system like a small MCU).
Unfortunately, situations like what I'm mentioning are not helped by something like -fshort-enums, because this feature is vendor-specific and less predictable (e.g. another system would have to "guess" enum size by approximating GCC's algorithm for -fshort-enums enum sizing). If anything, it would allow people to compile C code in a way that would break common assumptions made by bindings in other languages (or other C code that wasn't compiled with the same option), with the expected result being memory corruption as parameters or struct members get written to, or read from, the wrong locations in memory.
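As a rough illustration of what I mean by a binding-friendly API, here is a sketch with a made-up sensor_config struct. The enum only supplies named constants; the struct that crosses the language boundary uses fixed-width members and explicit padding, and the layout is pinned down at compile time. All names here are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint8_t, uint32_t */

/* Named constants stay in an enum for readability... */
enum sensor_mode {
    SENSOR_MODE_OFF    = 0,
    SENSOR_MODE_IDLE   = 1,
    SENSOR_MODE_ACTIVE = 2
};

/* ...but the API struct only uses fixed-width members with explicit padding. */
struct sensor_config {
    uint8_t  mode;            /* holds an enum sensor_mode value */
    uint8_t  reserved[3];     /* explicit padding, keeps the layout obvious */
    uint32_t sample_rate_hz;
};

/* Layout that a binding in another language can rely on. */
static_assert(sizeof(struct sensor_config) == 8, "unexpected struct size");
static_assert(offsetof(struct sensor_config, mode) == 0, "unexpected offset");
static_assert(offsetof(struct sensor_config, sample_rate_hz) == 4, "unexpected offset");

/* The raw byte can hold values outside the enum, so validate at the boundary. */
static int sensor_config_is_valid(const struct sensor_config *cfg)
{
    return cfg->mode <= SENSOR_MODE_ACTIVE;
}
```

The cost of this design is that the C compiler no longer knows the field is meant to hold an enum sensor_mode, which is why a validation helper at the API boundary is useful.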
As of C23, this is finally possible in standard C. You can put a colon and an integer type after the enum keyword (or after the name tag, if it's named) to specify the enum's fixed underlying type, which sets the size and range of the enum type.
Would it even be worth doing? When the enum value is passed to a function, will it be passed as an int-sized value regardless?
On x86_64, the type of an integer does not influence whether it is passed in a register or not (as long as it fits in a single register). The size of data on the heap, however, is very significant for cache performance.
It depends on the values assigned to the enumerators, and on the compiler.
Ex: If a value greater than 2^32-1 is stored, the size allocated for the overall enum will grow to the next size up.
For example, giving an enumerator the value 0xFFFFFFFFFFFF may produce a truncation warning when compiling for a 32-bit environment, whereas a 64-bit compilation will succeed and the size allocated will be 8 bytes.
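A quick way to observe this is sketched below. It assumes GCC or Clang, which accept enumerator values wider than int as an extension before C23; other compilers may truncate the value instead. The enum names are made up.

```c
#include <assert.h>   /* static_assert (C11) */

/* All values fit comfortably in an int. */
enum small_range { SMALL_MAX = 0x7F };

/* A 48-bit value does not fit in 32 bits, so a wider type is needed. */
enum large_range { LARGE_MAX = 0xFFFFFFFFFFFF };

static_assert(sizeof(enum large_range) >= 6, "must be able to hold a 48-bit value");
static_assert(sizeof(enum large_range) > sizeof(enum small_range),
              "the large enum should be wider than the small one");
```

On the GCC and Clang targets I am aware of, sizeof(enum large_range) comes out as 8.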