开发者

Allocate chunk of memory for array of structs

I need an array of this struct allocated in one solid chunk of memory. The length of "char *extension" and "char *type" are not known at compile time.

stru开发者_开发百科ct MIMETYPE
{
 char *extension;
 char *type;
};

If I used the "new" operator to initialize each element by itself, the memory may be scattered. This is how I tried to allocate a single contiguous block of memory for it:

//numTypes = total elements of array
//maxExtension and maxType are the needed lengths for the (char*) in the struct
//std::string ext, type;
unsigned int size = (maxExtension+1 + maxType+1) * numTypes;
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);

But, when I try to load the data in like this, the data is all out of order and scattered when I try to access it later.

for(unsigned int i = 0; i < numTypes; i++)
{
 //get data from file
 getline(fin, line);
 stringstream parser.str(line);
 parser >> ext >> type;

 //point the pointers at a spot in the memory that I allocated
 mimeTypes[i].extension = (char*)(&mimeTypes[i]);
 mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension);

 //copy the data into the elements
 strcpy(mimeTypes[i].extension, ext.c_str());
 strcpy(mimeTypes[i].type, type.c_str());
}

can anyone help me out?

EDIT:

unsigned int size = (maxExtension+1 + maxType+1);
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size * numTypes);

for(unsigned int i = 0; i < numTypes; i++)
    strcpy((char*)(mimeTypes + (i*size)), ext.c_str());
    strcpy((char*)(mimeTypes + (i*size) + (maxExtension+1)), type.c_str());


You mix 2 allocation:

1) manage array of MIMETYPE and

2) manage array of characters

May be (I don't really understand your objectives):

struct MIMETYPE
{
    char extension[const_ofmaxExtension];
    char type[maxType];
};

would be better to allocate linear items in form:

new MIMETYPE[numTypes];


I'll put aside the point that this is premature optimization (and that you ought to just use std::string, std::vector, etc), since others have already stated that.

The fundamental problem I'm seeing is that you're using the same memory for both the MIMETYPE structs and the strings that they'll point to. No matter how you allocate it, a pointer itself and the data it points to cannot occupy the exact same place in memory.


Lets say you needed an array of 3 types and had MIMETYPE* mimeTypes pointing to the memory you allocated for them.

That means you're treating that memory as if it contains:

8 bytes: mime type 0
8 bytes: mime type 1
8 bytes: mime type 2

Now, consider what you're doing in this next line of code:

mimeTypes[i].extension = (char*)(&mimeTypes[i]);

extension is being set to point to the same location in memory as the MIMETYPE struct itself. That is not going to work. When subsequent code writes to the location that extension points to, it overwrites the MIMETYPE structs.

Similarly, this code:

strcpy((char*)(mimeTypes + (i*size)), ext.c_str());

is writing the string data in the same memory that you otherwise want to MIMETYPE structs to occupy.


If you really want store all the necessary memory in one contiguous space, then doing so is a bit more complicated. You would need to allocate a block of memory to contain the MIMETYPE array at the start of it, and then the string data afterwards.

As an example, lets say you need 3 types. Lets also say the max length for an extension string (maxExtension) is 3 and the max length for a type string (maxType) is 10. In this case, your block of memory needs to be laid out as:

8 bytes: mime type 0
8 bytes: mime type 1
8 bytes: mime type 2
4 bytes: extension string 0
11 bytes: type string 0
4 bytes: extension string 1
11 bytes: type string 1
4 bytes: extension string 2
11 bytes: type string 2

So to allocate, setup, and fill it all correctly you would want to do something like:

unsigned int mimeTypeStringsSize = (maxExtension+1 + maxType+1);
unsigned int totalSize = (sizeof(MIMETYPE) + mimeTypeStringsSize) * numTypes;
char* data = (char*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, totalSize);

MIMETYPE* mimeTypes = (MIMETYPE*)data;
char* stringData = data + (sizeof(MIMETYPE) * numTypes);

for(unsigned int i = 0; i < numTypes; i++)
{
    //get data from file
    getline(fin, line);
    stringstream parser.str(line);
    parser >> ext >> type;

    // set pointers to proper locations
    mimeTypes[i].extension = stringData + (mimeTypeStringsSize * i);
    mimeTypes[i].type = stringData + (mimeTypeStringsSize * i) + maxExtension+1;

    //copy the data into the elements
    strcpy(mimeTypes[i].extension, ext.c_str());
    strcpy(mimeTypes[i].type, type.c_str());
}

(Note: I've based my byte layout explanations on typical behavior of 32-bit code. 64-bit code would have more space used for the pointers, but the principle is the same. Furthermore, the actual code I've written here should work regardless of 32/64-bit differences.)


What you need to do is get a garbage collector and manage the heap. A simple collector using RAII for object destruction is not that difficult to write. That way, you can simply allocate off the collector and know that it's going to be contiguous. However, you should really, REALLY profile before determining that this is a serious problem for you. When that happens, you can typedef many std types like string and stringstream to use your custom allocator, meaning that you can go back to just std::string instead of the C-style string horrors you have there.


You really have to know the length of extension and type in order to allocate MIMETYPEs contiguously (if "contiguously" means that extension and type are actually allocated within the object). Since you say that the length of extension and type are not known at compile time, you cannot do this in an array or a vector (the overall length of a vector can be set and changed at runtime, but the size of the individual elements must be known at compile time, and you can't know that size without knowing the length of extension and type).

I would personally recommend using a vector of MIMETYPEs, and making the extension and type fields both strings. You're requirements sound suspiciously like premature optimization guided by a gut feeling that dereferencing pointers is slow, especially if the pointers cause cache misses. I wouldn't worry about that until you have actual data that reading these fields is an actual bottleneck.

However, I can think of a possible "solution": you can allocate the extension and type strings inside the MIMETYPE object when they are shorter than a particular threshold and allocate them dynamically otherwise:

#include <algorithm>
#include <cstring>
#include <new>

template<size_t Threshold> class Kinda_contig_string {
    char contiguous_buffer[Threshold];
    char* value;

    public:

    Kinda_contig_string() : value(NULL) { }

    Kinda_contig_string(const char* s)
    {
         size_t length = std::strlen(s);
         if (s < Threshold) {
             value = contiguous_buffer;
         }
         else {
             value = new char[length];
         }

         std::strcpy(value, s);
    }

    void set(const char* s)
    {
        size_t length = std::strlen(s);

        if (length < Threshold && value == contiguous_buffer) {
            // simple case, both old and new string fit in contiguous_buffer
            // and value points to contiguous_buffer
            std::strcpy(contiguous_buffer, s);
            return;
        }

        if (length >= Threshold && value == contiguous_buffer) {
            // old string fit in contiguous_buffer, new string does not
            value = new char[length];
            std::strcpy(value, s);
            return;
        }

        if (length < Threshold && value != contiguous_buffer) {
            // old string did not fit in contiguous_buffer, but new string does
            std::strcpy(contiguous_buffer, s);
            delete[] value;
            value = contiguous_buffer;
            return;
        }

        // old and new strings both too long to fit in extension_buffer
        // provide strong exception guarantee
        char* temp_buffer = new char[length];
        std::strcpy(temp_buffer, s);
        std::swap(temp_buffer, value);
        delete[] temp_buffer;
        return;
    }

    const char* get() const
    {
        return value;
    }
}

class MIMETYPE {
    Kinda_contig_string<16> extension;
    Kinda_contig_string<64> type;

  public:
    const char* get_extension() const
    {
        return extension.get();
    }

    const char* get_type() const
    {
        return type.get();
    }

    void set_extension(const char* e)
    {
        extension.set(e);
    }

    // t must be NULL terminated
    void set_type(const char* t)
    {
        type.set(t);
    }

    MIMETYPE() : extension(), type() { }

    MIMETYPE(const char* e, const char* t) : extension(e), type(t) { }
};

I really can't endorse this without feeling guilty.


Add one byte in between strings... extension and type are not \0-terminated the way do it.

here you allocate allowing for an extra \0 - OK

unsigned int size = (maxExtension+1 + maxType+1) * numTypes;
mimeTypes = (MIMETYPE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);

here you don't leave any room for extension's ending \0 (if string len == maxExtension)

 //point the pointers at a spot in the memory that I allocated
 mimeTypes[i].extension = (char*)(&mimeTypes[i]);
 mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension);

instead i think it should be

 mimeTypes[i].type = (char*)((&mimeTypes[i]) + maxExtension + 1);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜