How do I allocate variably-sized structures contiguously in memory?
I'm using C++, and I have the following structures:
struct ArrayOfThese { int a; int b; }; struct DataPoint { int a; int b; int c; };
In memory, I want to have 1 or more ArrayOfThese elements at the end of each DataPoint. There are not always the same number of ArrayOfThese elements per DataPoint.
Because I have a ridiculous number of DataPoints to assemble and then stream across a network, I want all my DataPoints and their ArrayOfThese elements to be contiguous. Wasting space for a fixed number of the ArrayOfThese elements is unacceptable.
In C, I would have made an element at the end of DataPoint that was declared as ArrayOfThese d[0];
, allocated a DataPoint plus enough extra bytes for however many ArrayOfThese elements I had, and used the dummy array to index into them. (Of course, the number of ArrayOfThese elements would have开发者_Python百科 to be in a field of DataPoint.)
In C++, is using placement new and the same 0-length array hack the correct approach? If so, does placement new guarantee that subsequent calls to new from the same memory pool will allocate contiguously?
Since you are dealing with plain structures that have no constructors, you could revert to C memory management:
void *ptr = malloc(sizeof(DataPoint) + n * sizeof(ArrayOfThese));
DataPoint *dp = reinterpret_cast<DataPoint *>(ptr));
ArrayOfThese *aotp = reinterpet_cast<ArrayOfThese *>(reintepret_cast<char *>(ptr) + sizeof(DataPoint));
Since your structs are PODs you might as well do it just as you would in C. The only thing you'll need is a cast. Assuming n
is the number of things to allocate:
DataPoint *p=static_cast<DataPoint *>(malloc(sizeof(DataPoint)+n*sizeof(ArrayOfThese)));
Placement new does come into this sort of thing, if your objects have a a non-trivial constructor. It guarantees nothing about any allocations though, for it does no allocating itself and requires the memory to have been already allocated somehow. Instead, it treats the block of memory passed in as space for the as-yet-unconstructed object, then calls the right constructor to construct it. If you were to use it, the code might go like this. Assume DataPoint
has the ArrayOfThese arr[0]
member you suggest:
void *p=malloc(sizeof(DataPoint)+n*sizeof(ArrayOfThese));
DataPoint *dp=new(p) DataPoint;
for(size_t i=0;i<n;++i)
new(&dp->arr[i]) ArrayOfThese;
What gets constructed must get destructed so if you do this you should sort out the call of the destructor too.
(Personally I recommend using PODs in this sort of situation, because it removes any need to call constructors and destructors, but this sort of thing can be done reasonably safely if you are careful.)
As Adrian said in his answer, what you do in memory doesn't have to be the same as what you stream over the network. In fact, it might even be good to clearly divide this, because having a communication protocol relying on your data being designed in a specific way makes huge problem if you later need to refactor your data.
The C++ way to store an arbitrary number of elements contiguously is of course to std::vector
. Since you didn't even consider this, I assume that there's something that makes this undesirable. (Do you only have small numbers of ArrayOfThese
and fear the space overhead associated with std::vector
?)
While the trick with over-allocating a zero-length array probably isn't guaranteed to work and might, technically, invoke the dreaded undefined behavior, it's a widely spread one. What platform are you on? On Windows, this is done in the Windows API, so it's hard to imagine a vendor shipping a C++ compiler which wouldn't support this.
If there's a limited number of possible ArrayOfThese
element counts, you could also use fnieto's trick to specify those few numbers and then new
one of the resulting template instances, depending on the run-time number:
struct DataPoint {
int a;
int b;
int c;
};
template <std::size_t sz>
struct DataPointWithArray : DataPoint {
ArrayOfThese array[sz];
};
DataPoint* create(std::size_t n)
{
switch(n) {
case 1: return new DataPointWithArray[1];
case 2: return new DataPointWithArray[2];
case 5: return new DataPointWithArray[5];
case 7: return new DataPointWithArray[7];
case 27: return new DataPointWithArray[27];
default: assert(false);
}
return NULL;
}
Prior to C++0X, the language had no memory model to speak of. And with the new standard, I don't recall any talk of guarantees of contiguity.
Regarding this particular question, it sounds as if what you want is a pool allocator, many examples of which exist. Consider, for instance, Modern C++ Design, by Alexandrescu. The small object allocator discussion is what you should look at.
I think boost::variant
might accomplish this. I haven't had an opportunity to use it, but I believe it's a wrapper around unions, and so a std::vector
of them should be contiguous, but of course each item will take up the larger of the two sizes, you can't have a vector with differently-sized elements.
Take a look at the comparison of boost::variant and boost::any.
If you want the offset of each element to be dependent on the composition of the previous elements, you will have to write your own allocator and accessors.
Seems like it would be simpler to allocate an array of pointers and work with that rather than using placement new. That way you could just reallocate the whole array to the new size with little runtime cost. Also if you use placement new, you have to explicitly call destructors, which means mixing non-placement and placement in a single array is dangerous. Read http://www.parashift.com/c++-faq-lite/dtors.html before you do anything.
don't confuse data organisation inside your program and data organisation for serialization: they do not have the same goal.
for streaming across a network, you have to consider both side of the channel, the sending and the receiving side: how does the receiving side differentiate between a DataPoint and an ArrayOfThese ? how does the receiving side know how many ArrayOfThese are appended after a DataPoint ? (also to consider: what is the byte ordering of each side ? does data types have the same size in memory ?)
personally, i think you need a different structure for streaming your data, in which you add the number of DataPoint you are sending as well as the number of ArrayOfThese after each DataPoint. i would also not care about the way data is already organized in my program and reorganize/reformat to suit my protocol and not my program. after that writing a function for sending and another for receiving is not a big deal.
Why not have DataPoint contain a variable-length array of ArrayOfThese items? This will work in C or C++. There are some concerns if either struct contains non-primitive types
But use free() rather than delete on the result:
struct ArrayOfThese {
int a;
int b;
};
struct DataPoint {
int a;
int b;
int c;
int length;
ArrayOfThese those[0];
};
DataPoint* allocDP(int a, int b, int c, size_t length)
{
// There might be alignment issues, but not for most compilers:
size_t sz = sizeof(DataPoint) + length * sizeof(ArrayOfThese);
DataPoint dp = (DataPoint*)calloc( sz );
// (Check for out of memory)
dp->a = a; dp->b = b; tp->c = c; dp->length = length;
}
Then you can use it "normally" in a loop where the DataPoint knows its length:
DataPoint *dp = allocDP( 5, 8, 3, 20 );
for(int i=0; i < dp->length; ++i)
{
// Initialize or access: dp->those[i]
}
Could you make those into classes with the same superclass and then use your favourite stl container of choice, using the superclass as the template?
Two questions:
- Is the similarity between ArrayOfThese and DataPoint real, or a simplification for posting? I.e. is the real difference just one int (or some arbitrary number of the same type of items)?
- Is the number of ArrayOfThese associated with a particular DataPoint known at compile time?
If the first is true, I'd think hard about simply allocating an array of as many items as necessary for one DataPoint+N ArrayOfThese. I'd then build a quick bit of code to overload operator[] for that to return item N+3, and overload a(), b() and c() to return the first three items.
If the second is true, I was going to suggest essentially what I see fnieto has just posted, so I won't go into more detail.
As far as placement new goes, it doesn't really guarantee anything about allocation -- in fact, the whole idea about placement new is that it's completely unrelated to memory allocation. Rather, it allows you to create an object at an arbitrary address (subject to alignment restrictions) in a block of memory that's already allocated.
Here's the code I ended up writing:
#include <iostream>
#include <cstdlib>
#include <cassert>
using namespace std;
struct ArrayOfThese {
int e;
int f;
};
struct DataPoint {
int a;
int b;
int c;
int numDPars;
ArrayOfThese d[0];
DataPoint(int numDPars) : numDPars(numDPars) {}
DataPoint* next() {
return reinterpret_cast<DataPoint*>(reinterpret_cast<char*>(this) + sizeof(DataPoint) + numDPars * sizeof(ArrayOfThese));
}
const DataPoint* next() const {
return reinterpret_cast<const DataPoint*>(reinterpret_cast<const char*>(this) + sizeof(DataPoint) + numDPars * sizeof(ArrayOfThese));
}
};
int main() {
const size_t BUF_SIZE = 1024*1024*200;
char* const buffer = new char[BUF_SIZE];
char* bufPtr = buffer;
const int numDataPoints = 1024*1024*2;
for (int i = 0; i < numDataPoints; ++i) {
// This wouldn't really be random.
const int numArrayOfTheses = random() % 10 + 1;
DataPoint* dp = new(bufPtr) DataPoint(numArrayOfTheses);
// Here, do some stuff to fill in the fields.
dp->a = i;
bufPtr += sizeof(DataPoint) + numArrayOfTheses * sizeof(ArrayOfThese);
}
DataPoint* dp = reinterpret_cast<DataPoint*>(buffer);
for (int i = 0; i < numDataPoints; ++i) {
assert(dp->a == i);
dp = dp->next();
}
// Here, send it out.
delete[] buffer;
return 0;
}
精彩评论