C: Allocate memory ahead or on every demand?
I've got 开发者_JAVA技巧a simple (propably) question for you. When creating own dynamic array implementation in C, is it better to allocate memory ahead(let's presume, for 10 elements) when an array is close to reach its current maximum capability, or to realloc memory every time when element count changes?
I mean: for performance, elegance or whatever comes to your mind.
The usual choice is to multiple the current size by a fixed number greater than one (1.5ish is common, as is 2) which gets you amortized O(n) total allocation and copying costs.
Note that there is no guarantee that you can increase the size of the array in place, so you may have to copy all the existing elements to the new location (realloc()
does this automatically for you, but you still have to pay the cost).
It's much better for performance to realloc as few times as possible. So you can do something like:
array = realloc(array, old_size * 2);
It is usually better to pre-allocate at a "starting" fixed-size and when you run out of space reallocate based on a growth-factor.
Depending on your needs the starting size and growth factor can be determined based on typical usage of the data you're dealing with or the API you create for your data can allow the caller to specify the starting size and growth factor as part of an initialization/creation call.
The starting size should be a number based on typical usage. Typical usage is the important factor which helps you pick a size so that you: A) don't waste space by choosing a starting size that is 'too large' and B) don't use a starting size too small to where many reallocations will be necessary until the target typical size is reached.
Typical size, of course, is a magic number. A way to determine typical size is to run some tests using various data sets and collect stats for starting size, number of reallocations and min/max memory usage for your data. You can average the results to get a usable typical starting size.
As for growth factor, a x1.5 or x2 growth factor is common. This is something you can gauge using test stats as with the starting size.
Another thing to consider is that you'll need to be careful of managing references to dynamically resizable data, as a realloc() will relocate the data in memory when needed. This means if you stored the address of the first element of the dynamically resizable array that address may be invalid after a call to realloc. This can be managed by an API wrapper around your custom data type that hands out indexes instead of memory addresses and has a way to resolve the index to the current address of the element when needed.
Allocating memory on heap is always costly. Doing it element-by-element is not at all suggestable.
You can define a structure like this:
/** Dynamic array */
typedef struct __darray {
int* array; /** Array */
int size; /** Array size */
int cap; /** Capacity */
} darray;
Size <= capacity .
When you dynamically add a new element test if your size > capacity. If true perform a memory reallocation. The formula (taken from the ArrayList implementantion from JDK) is:
(jobs here...)
[your_darray]->capacity+=[your_darray]->capacity*3/2+1;
[your_darray]->array=(int*)realloc([your_darray]->array,capacity*sizeof(int));
(jobs here...)
If you remove (enough) elements from you dynamic array, don't forget to shrink again the "int*".
If this dynamic array implemention is intented to be used in lots of contexts, then it's better to give control of the predictive extra allocation to the user rather than embeding it in the library.
A more recent memory allocation strategy has been the Slab Allocator adopted by Solaris and Linux. Memcached also uses a slab allocator as documented by this FAQ entry.
精彩评论