Is it safe to allocate too little space (if you know you won't need it)?
So C99 blessed the commonly-used "flexible array member" hack to allow us to make struct
s that could be overallocated to suit our size requirements. I suspect it's perfectly safe on most sane implementations to do this, but is it legal in C to "underallocate" if we know in certain situations that we won't need some members of a struct
?
Abstract Example
Say I have a type:
struct a {
bool data_is_x;
void * data;
size_t pos;
};
If data_is_x
, then the type of data
is a type that needs to use the pos
member. Otherwise, the functions that work with this struct
won't need the pos
member for this particular copy of the struct
. Essentially, the struct
carries around information about whether or not it has a pos
member, and this information will not be changed within the struct
's lifetime (outside of evil mischief, which will break pretty much anything anyway). Is it safe to say:
struct a *a = malloc(data_is_x ? sizeof(struct a) : offsetof(struct a, pos));
which will allocate space for a pos
member only if one is needed? Or does it violate a constraint to use cast space that is too small to a struct
pointer, even when you never use the members in question?
Concrete Example
My real-world use case is a bit involved; it's here mainly so you can understand why I want to do this:
typedef struct {
size_t size;
void * data;
size_t pos;
} mylist;
The code for mylist_create
specifies that, for size > 0
, data
is an array of contiguous data that is size
items long (whatever an item may be), but that for size == 0
it is the current node of a doubly-linked list containing the items. All the functions that work with mylist
s will check whether size == 0
. If it does, they'll handle the data as a linked list with the "current" index being whichever node data
points to. If not, they'll handle the data as an array with the "current" index stored in pos
.
Now if size == 0
we don't really need the pos
member, but if size > 0
we will. So my question is, is it legal to do this:
mylist *list = malloc(size ? sizeof(mylist) : offsetof(mylist, pos));
If w开发者_开发百科e guarantee (on penalty of undefined behavior) that, while size == 0
, we will never try to (or need to) access the pos
member? Or does it say somewhere in the standard that it's UB to even think about doing this?
malloc
itself doesn't care at all how much memory you allocate for a structure, it's the dereferencing of memory outside the block that is undefined. From C99 6.5.3.2 Address and indirection operators
:
If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
And, from 7.20.3 Memory management functions
, we find (my italics):
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
Hence, you can do something like:
typedef struct { char ch[100]; } ch100;
ch100 *c = malloc (1);
and, provided you only ever try to do anything with c->ch[0]
, it's perfectly acceptable.
For your specific concrete example, I'm not too sure I'd be that worried, assuming that what you're concerned about is storage space. If you're concerned for other reasons, feel free to ignore this bit, especially since the assumptions included within are not mandated by the standard.
From my understanding, you have a structure:
typedef struct {
size_t size;
void * data;
size_t pos;
} mylist;
where you want to use only data
where size
is 0, and both data
and pos
where size
is greater than 0. This precludes the use of putting data
and pos
in a union.
A significant number of malloc
implementations will round up your requested space to a multiple of 16 bytes (or some greater power of two) in order to ease memory fragmentation issues. This isn't required by the standard of course, but it is pretty common.
Assuming (for example) 32-bit pointers and size_t
, your twelve bytes of structure will most likely take up a 16-byte arena header and a 16-byte chunk for the data. This chunk would still be 16 bytes even if you only asked for 8 (ie. without pos
).
If you had 64-bit pointer and size_t
types, it might make a difference - 24 bytes with pos
and 16 without.
But even then, unless you're allocating a lot of these structures, it may not be a problem.
It is perfectly legal, but you should probably make it less obscure by having two structs, and when you read it:
struct leaf_node {
size_t size;
void *data;
size_t pos;
};
struct linked_node {
size_t size;
void *next;
};
void *in = ...;
if (*(size_t*)(in) == 0) {
struct leaf_node *node = in;
...
} else {
struct linked_node *node = in;
....
}
This goes more hand-in-hand with the standard that paxdiablo quoted, that you can cast the pointer to any data pointer. If you do it this way you will also always make sure you cast it to a struct that fits in the allocated buffer (an unnecessary but convenient feat).
What paxdiablo said about a minimum size of 16 bytes on 32-bit systems is often true, but you can still allocate big chunks to get around that.
On a 32-bit system the linked_node will use 8 bytes. You must use pools to benefit from what you are trying to do.
struct leaf_node *leaf_pool = malloc(N*sizeof(struct leaf_node));
struct linked_node *linked_pool = malloc(N*sizeof(struct linked_node));
Of course you should never realloc the pools, but allocate new pools as needed and reuse nodes. In this case a single leaf_node
will use 12 bytes.
Same applies to linked_node
, which will use 8 bytes instead of 16 bytes if you allocate them in a pool.
There will be no performance bottleneck as long as your structs aren't using __attribute__ ((packed))
in GCC, in which case your structs could get very badly aligned. Especially if you have, for example, an extra char in your struct giving it a size of 13 bytes.
Now if we go back to your initial question, the pointer you use to point to allocated data doesn't matter as long as you make sure you don't access data outside of the buffer. Your struct is essentially like a char string, and you check if the first size_t is a "null byte", if it is then assume the buffer is smaller. If it is not null, then the "string" is assumed to be longer and you read more data. The exact same risks are involved, and the only difference after compilation is the size per element read. There is nothing magic about using [el]
for strings compared to casting to a struct pointer and reading elements, as you can verify by just as easily causing a segfault using [el]
.
As far as I can tell, any member access is also an access to the aggregate and thus declares an effective type, ie we get an allocated object which is too small to actually contain a value of its type.
This smells like undefined behaviour, but I can't actually back that from the standard, and there are also reasonable arguments supporting the other interpretation.
You might think you are saving 4 or 8 bytes but your memory allocation can be aligned. If you are using gcc and its 16-byte aligned you can get something like this.
for (int i = 0; i <= 64; i++) {
char *p = (char *) malloc(i);
char *q = (char *) malloc(i);
long long t = q - p;
cout << "malloc(" << i << ") used " << t << " bytes " << endl;
}
prints
malloc(0) used 32 bytes
malloc(1) used 32 bytes
malloc(2) used 32 bytes
malloc(3) used 32 bytes
malloc(4) used 32 bytes
malloc(5) used 32 bytes
malloc(6) used 32 bytes
malloc(7) used 32 bytes
malloc(8) used 32 bytes
malloc(9) used 32 bytes
malloc(10) used 32 bytes
malloc(11) used 32 bytes
malloc(12) used 32 bytes
malloc(13) used 32 bytes
malloc(14) used 32 bytes
malloc(15) used 32 bytes
malloc(16) used 32 bytes
malloc(17) used 32 bytes
malloc(18) used 32 bytes
malloc(19) used 32 bytes
malloc(20) used 32 bytes
malloc(21) used 32 bytes
malloc(22) used 32 bytes
malloc(23) used 32 bytes
malloc(24) used 32 bytes
malloc(25) used 48 bytes
malloc(26) used 48 bytes
malloc(27) used 48 bytes
malloc(28) used 48 bytes
malloc(29) used 48 bytes
malloc(30) used 48 bytes
malloc(31) used 48 bytes
malloc(32) used 48 bytes
malloc(33) used 48 bytes
malloc(34) used 48 bytes
malloc(35) used 48 bytes
malloc(36) used 48 bytes
malloc(37) used 48 bytes
malloc(38) used 48 bytes
malloc(39) used 48 bytes
malloc(40) used 48 bytes
malloc(41) used 64 bytes
malloc(42) used 64 bytes
malloc(43) used 64 bytes
malloc(44) used 64 bytes
malloc(45) used 64 bytes
malloc(46) used 64 bytes
malloc(47) used 64 bytes
malloc(48) used 64 bytes
malloc(49) used 64 bytes
malloc(50) used 64 bytes
malloc(51) used 64 bytes
malloc(52) used 64 bytes
malloc(53) used 64 bytes
malloc(54) used 64 bytes
malloc(55) used 64 bytes
malloc(56) used 64 bytes
malloc(57) used 80 bytes
malloc(58) used 80 bytes
malloc(59) used 80 bytes
malloc(60) used 80 bytes
malloc(61) used 80 bytes
malloc(62) used 80 bytes
malloc(63) used 80 bytes
malloc(64) used 80 bytes
Depending on your system, it could be the case that whether you use malloc(0) or malloc(24) the same amount of memory is used.
The saving of 4 bytes in an allocation is almost meaningless unless you are talking about many tens of thousands of them, and in that case you are likely to want to use a pool allocation scheme with "freed" structs temporarily but on an "available" list (the "pool") instead of constantly freeing and reallocating them. I'll guarantee it will be faster. But to use such a scheme cleanly, all the re-usable parts need to be easily interchangeable -- that is, have the "size_t pos" member.
So, yes, what you're thinking of doing is perfectly legal; I am just not sure it is worth the complication and the lack of flexibility it imposes.
精彩评论