C++ Vectors of Objects and Pointers

This is a contrived example that illustrates a problem I've encountered. Basically, I create a vector of objects, then a vector of pointers to the objects, then print the pointers and the dereferenced objects.

#include <vector>
#include <iostream>

using namespace std;

namespace {
    struct MyClass {
        int* MyInt;
        MyClass(int* i) : MyInt(i) {}
    };

    struct MyBigClass {
        vector<MyClass> AllMyClassRecords;  // Where I keep the MyClass instances
        vector<int> TheInts;

        void loadMyClasses();
        void readMyClasses();
        MyBigClass() {}
    };

}

void MyBigClass::loadMyClasses() {
    for (int i = 0; i < 10; ++i) {
        TheInts.push_back(i);   // Create an int
        int *j = &TheInts[TheInts.size() - 1];  // Create a pointer to the new int
        AllMyClassRecords.push_back(MyClass(j));    // Create a MyClass using pointer
    }
}

void MyBigClass::readMyClasses() {
    for (vector<MyClass>::iterator it = AllMyClassRecords.begin();
            it != AllMyClassRecords.end(); ++it)
        cout << it->MyInt << " => " << *(it->MyInt) << endl;
}

int main() {
    MyBigClass MBC;

    MBC.loadMyClasses();
    MBC.readMyClasses();
}

Basically, I want to create a vector of pointers to the elements of another vector of ints. The problem is that this code prints out the following:

0x97ea008 => 159293472
0x97ea02c => 1
0x97ea040 => 2
0x97ea044 => 3
0x97ea078 => 4
0x97ea07c => 5
0x97ea080 => 6
0x97ea084 => 7
0x97ea0d8 => 8
0x97ea0dc => 9

It appears to work as expected except for the first value, which is likely some garbage in memory. Why is only the first value affected? If my code is broken, why is it only broken for the first pointer inserted?


Update: I'm compiling this using g++ on Ubuntu. As far as specifically what I'm doing, I'm creating a compiler analysis pass. The MyClass objects hold information about instructions, which I want to update when I locate certain registers. The register number indexes a vector of vectors, so a particular register number will have a vector of MyClass*s. Thus, if a register is found, any MyClass pointers in the vector will be used to update the MyClass object held in the separate MyClass vector. Because I'm accumulating both instruction info stored in MyClass objects and register info which must follow MyClass pointers, I can't create the entire MyClass vector first without creating a separate pass, which I'd like to avoid.


Update2: Now with pictures...

Pass Progress           inserts...   InstRecs (TheInt)  and updates...  UpdatePtrs (MyClass) 
----------------------              ------------------                  -----------------------
| => I1: Uses r0, r1 |              | InstRec for I1 |                  | r0: InstRec for I1* |
|    I2: Uses r0, r2 |              ------------------                  | r1: InstRec for I1* |
----------------------                                                  -----------------------

First the pass inserts an InstRec with info about I1. It also creates pointers to this new InstRec indexed by register number. r0 here is actually a vector of one element that points to the InstRec for I1, so that if r0 is ever encountered again in a subsequent instruction, InstRec for I1 will be updated.

Pass Progress           inserts...   InstRecs (TheInt)  and updates...  UpdatePtrs (MyClass) 
----------------------              ------------------                  -----------------------
|    I1: Uses r0, r1 |              | InstRec for I1 |                  | r0: InstRec for I1* |
| => I2: Uses r0, r2 |              | InstRec for I2 |                  |     InstRec for I2* |
----------------------              ------------------                  | r1: InstRec for I1* |
                                                                        | r2: InstRec for I2* |
                                                                        -----------------------

Similarly, the second entry will be inserted into InstRecs and pointers will be added to the UpdatePtrs structure. Since I2 uses r0, another InstRec pointer is pushed to the r0 vector. Not shown is this: when it is detected that I2 uses r0, the pass looks in the UpdatePtrs structure at the r0 vector of pointers, follows each pointer to its InstRec entry, and updates the InstRec with new info.

Hopefully that makes what I'm trying to do a little bit clearer. I've implemented the suggestion first proposed by @MerickOWA of using InstRec vector indices rather than InstRec pointers (since once the InstRecs are added to the array, they never move), and it seems to be working now.


What you are doing is very similar to creating something like this:

vector<int> MyInts;
vector< vector<int>::iterator > MyIntIters;

And then every time you add a new int to MyInts, you get the iterator and push that iterator into MyIntIters.

You can't do this with a vector because the iterators can become invalid any time you add a new int to MyInts.

So your whole structure is broken. You need to come up with an entirely new design. First I'd ask you why you want a vector of iterators (or pointers) to another vector. Is it for sorting? Indexing somehow? Something else? There's certainly a better way to do whatever it is you're trying to do. What you're trying to do will help determine how to do it.

EDIT:

After reading & rereading your update several times, it seems to me like you're trying to create a vector of gloms. That is, a buffer of variable length that has structured data at the beginning and something else following it.

This is fairly tricky business, and it requires dynamic allocation. You're not using dynamic allocation, though -- you push the objects into the vector by value. The addresses of those objects will change as the vector is resized and shuffled.

So if this is what you're trying to do, you need to change things around so that you create your glom using new, and push the pointer to the glom onto the vector. This opens a whole Pandora's box of trouble though. How big do you make the buffers? How do you parse the glom? How do you handle deep copies, resizes, etc.? How do you properly deallocate the gloms without leaking like a sieve? As I said, tricky business. In my line of work we do this kind of thing all the time, and have a bunch of standard practices that we use. These took a lot of effort and a lot of testing with trial and error to get right, and we still find problems. If this is what you're doing, you might consider going another way.


It's not broken only for the first one; that just happens to be the only one whose value visibly got corrupted. If you look closely, pointers 0 through 7 point into different chunks of memory, when they should all be right next to each other.

std::vector can reallocate its storage whenever you add an element and the current capacity is not enough. You can't rely on pointers (or iterators) still being valid after adding or inserting something that could increase its size.

There are many solutions, such as:
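To make the reallocation visible, here is a minimal sketch that counts how often the capacity changes while ints are appended one at a time. The function name countReallocations is made up for illustration, and the exact count depends on the standard library's growth policy, so no particular number is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times a vector's capacity changes while `count` ints are
// appended one by one. Every change means the elements were moved to a new
// buffer, so any pointer taken before that push_back is now dangling.
std::size_t countReallocations(std::size_t count) {
    std::vector<int> ints;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = ints.capacity();

    for (std::size_t i = 0; i < count; ++i) {
        ints.push_back(static_cast<int>(i));
        assert(ints.capacity() >= ints.size());  // always holds for a vector

        if (ints.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = ints.capacity();
        }
    }
    return reallocations;
}
```

Calling countReallocations(10) on a typical implementation reports several reallocations, which matches the separate chunks of addresses in the question's output.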

a) you can reserve space (using vector::reserve) ahead of time so that this dynamic re-allocation doesn't occur, but you have to know the maximum number of objects you'll be adding.

b) wait until all the objects are added before getting the pointers to the objects

c) use an index into the vector (paired with a pointer to the vector itself, if needed) to refer to your objects, as opposed to using object pointers directly. It's more work, but the indices won't change (assuming you're not inserting or removing objects at the beginning or in the middle)

d) try to detect when the vector has reallocated its data (vector::capacity returns a different value) and flush your pointer vector and rebuild it

e) use a different container for the objects that doesn't reallocate on changes, like std::list, and give up random access on your container of base objects. You can still use a vector of pointers, but now the pointers don't become invalidated. (Do you really need random access and pointers to objects?)

f) rethink your design to not require such pointers
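Option (a) could look roughly like the sketch below, which adapts the question's MyBigClass. It assumes the caller knows an upper bound on the number of ints; the maxCount parameter is hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MyClass {
    int* MyInt;
    explicit MyClass(int* i) : MyInt(i) {}
};

struct MyBigClass {
    std::vector<MyClass> AllMyClassRecords;
    std::vector<int> TheInts;

    // `maxCount` (hypothetical) is the largest number of ints ever stored.
    void loadMyClasses(std::size_t maxCount) {
        // Reserve once up front: push_back will not reallocate TheInts
        // as long as its size stays within the reserved capacity.
        TheInts.reserve(maxCount);
        assert(TheInts.capacity() >= maxCount);

        for (std::size_t i = 0; i < maxCount; ++i) {
            TheInts.push_back(static_cast<int>(i));
            // Safe only because no reallocation can happen in this loop.
            AllMyClassRecords.push_back(MyClass(&TheInts.back()));
        }
    }
};
```

The pointers stay valid only while TheInts never grows past the reserved capacity and the MyBigClass object itself is not copied, so this is the most fragile of the listed options.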


You are trying to store pointers to the contents of a vector. This does not work because the vector is designed to move its contents around in memory in order to implement resizing.

Depending on what you need, you might:

  • store indices into the vector instead

  • invent some other kind of "handle"

  • change the vector to store smart pointers to the instances (which would be allocated with 'new' and exist separately), and then store another smart pointer to the same instance. The smart pointers will maintain a reference count, and deallocate the memory once the instances are no longer referred to by either the vector or the outside code. (There exist pre-made solutions for this, of course: see boost::shared_ptr for example.)
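The first suggestion, storing indices, might be sketched as follows. IntHandle and resolve are names invented for this example; the idea is only that an index stays meaningful after the vector reallocates, as long as nothing is inserted or erased before that position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical handle type: remembers a position instead of an address.
struct IntHandle {
    std::size_t index;

    // Looks the element up at use time, so it always sees the current buffer.
    int& resolve(std::vector<int>& ints) const {
        assert(index < ints.size());
        return ints[index];
    }
};

struct MyBigClass {
    std::vector<IntHandle> AllMyClassRecords;
    std::vector<int> TheInts;

    void loadMyClasses(int count) {
        for (int i = 0; i < count; ++i) {
            TheInts.push_back(i);  // may reallocate; indices are unaffected
            AllMyClassRecords.push_back(IntHandle{TheInts.size() - 1});
        }
    }
};
```

Reading a value then becomes record.resolve(TheInts) instead of *record.MyInt, at the cost of one extra indexing step per access.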


push_back can invalidate iterators and pointers into a vector. What is happening:

  1. You push to TheInts
  2. Then you take a pointer to just pushed element and store it in another vector

The push in point 1 above may reallocate the vector, which invalidates all pointers previously taken to the preceding elements. What they then point to is beyond your control.

Why does it show up only once? Just luck: dereferencing a dangling pointer is undefined behavior, and the freed memory often still holds the old values (the vector may also allocate in larger chunks than currently needed).

Use a deque or a list, or reserve the space needed beforehand (if you can be sure how much you need), or take the pointers into the first vector only after you have finished adding to it (yes, that needs two loops).
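As a rough sketch of the deque variant: std::deque::push_back invalidates iterators but leaves pointers and references to existing elements valid, so addresses taken during the loop remain usable. The helper name collectPointers is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Appends `count` ints to `ints` and returns a pointer to each new element.
// With std::deque, push_back at the end never moves existing elements,
// so the pointers collected earlier in the loop stay valid.
std::vector<int*> collectPointers(std::deque<int>& ints, int count) {
    std::vector<int*> pointers;
    for (int i = 0; i < count; ++i) {
        ints.push_back(i);
        pointers.push_back(&ints.back());
    }
    assert(pointers.size() == static_cast<std::size_t>(count));
    return pointers;
}
```

This guarantee covers insertion at either end only; inserting into or erasing from the middle of a deque still invalidates pointers to its elements.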
