
Store pointers or objects in classes?

Just a design/optimization question. When do you store pointers or objects and why? For example, I believe both of these work (barring compile errors):

#include <memory>

class A{
public:
  A();
private:
  std::unique_ptr<Object> object_ptr;
};

A::A():object_ptr(new Object()){}

class B{
public:
  B();
private:
  Object object;
};

B::B():object(Object()){}

I believe one difference comes when instantiating on stack or heap?

For example:

   int main(){
      std::unique_ptr<A> a_ptr;
      std::unique_ptr<B> b_ptr;
      a_ptr.reset(new A()); //(*object_ptr) on heap, (*a_ptr) on heap?
      b_ptr.reset(new B()); //(*b_ptr) on heap, object on heap?

      A a;   //a on stack, (*object_ptr) on heap?
      B b;   //b on stack, object on stack?
   }

Also, sizeof(A) should be < sizeof(B)? Are there any other issues that I am missing? (Daniel reminded me about the inheritance issue in his related post in the comments.)

So, since stack allocation is generally faster than heap allocation, but A is smaller than B, is this one of those tradeoffs that cannot be answered without measuring performance in the case in question, even with move semantics? Or are there rules of thumb for when one is more advantageous than the other? (San Jacinto corrected me: it is stack allocation that is faster than heap allocation, not the stack itself that is faster than the heap.)

I would guess that copy constructing more often would lead to the same performance issue (3 copies would cost roughly 3x the hit of creating the first instance). But with move construction, might it be more advantageous to use the stack as much as possible?

Here is a related question, but not exactly the same: "C++ STL: should I store entire objects, or pointers to objects?"

Thanks!


If you have a big object inside your A class, then I'd store a pointer to it, but for small objects, or primitive types, you should not really need to store pointers, in most cases.

Also, whether something is stored on the stack or on the heap (free store) is really implementation dependent, and A a is not always guaranteed to be on the stack.

It's better to call this an automatic object, because its storage duration is determined by the scope of the function it is declared in. When the function returns, a will be destroyed.
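As a rough illustration of automatic storage duration, here is a minimal sketch. The names `Tracer`, `live_count` and `count_inside_scope` are made up for this example and are not from the question; the point is only that the object is destroyed when its scope ends, wherever the implementation chose to put it.

```cpp
#include <cassert>

// Hypothetical helper: number of Tracer objects currently alive.
static int live_count = 0;

struct Tracer {
    Tracer() { ++live_count; }
    Tracer(const Tracer&) { ++live_count; }
    ~Tracer() { --live_count; }
};

// Returns how many Tracer objects were alive inside the inner block.
int count_inside_scope() {
    int inside = 0;
    {
        Tracer t;             // automatic object: lifetime tied to this block
        inside = live_count;  // t is alive here
    }                         // t is destroyed here, no delete needed
    assert(live_count == 0);  // nothing outlives the block
    return inside;
}
```

Nothing in this sketch depends on where `t` actually lives in memory, only on when it is destroyed.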

Pointers require the use of new, which does carry some overhead, but on today's machines I'd say it is trivial in most cases, unless of course you start newing up millions of objects; then you will start seeing performance issues.

Each situation is different, and when you should and shouldn't use a pointer, instead of an automatic object, is largely dependent on your situation.


This depends on a lot of specific factors, and either approach can have its merits. I'd say if you will exclusively use the outer object through dynamic allocation, then you might as well make all the members direct members and avoid the additional member allocation. On the other hand, if the outer object is allocated automatically, large members should probably be handled through a unique_ptr.

There's an additional benefit to handling members only through pointers: You remove compile-time dependencies, and the header file for the outer class may be able to get away with a forward-declaration of the inner class, rather than requiring full inclusion of the inner class's header ("PIMPL"). In large projects this sort of decoupling may turn out to be economically sensible.
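A minimal single-file sketch of that decoupling, assuming a hypothetical `Widget` class with a hidden `Impl` type (neither name comes from the question). In a real project the two halves would live in a header and a source file, as the comments indicate.

```cpp
#include <cassert>
#include <memory>

// --- would live in the header (hypothetical widget.h) ---
class Widget {
public:
    Widget();
    ~Widget();                  // declared here, defined where Impl is complete
    int value() const;
private:
    struct Impl;                // forward declaration only; no #include needed
    std::unique_ptr<Impl> impl_;
};

// --- would live in the source file (hypothetical widget.cpp) ---
struct Widget::Impl {
    int value = 42;             // implementation details hidden from clients
};

Widget::Widget() : impl_(new Impl()) {}
Widget::~Widget() = default;    // Impl is complete here, so unique_ptr can delete it
int Widget::value() const { return impl_->value; }

// Small usage check.
void pimpl_demo() {
    Widget w;
    assert(w.value() == 42);
}
```

Note that the destructor has to be defined after `Impl` is complete; leaving it implicitly defined in the header would require the full type there and defeat the purpose.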


The heap is not "slower" than the stack. Heap allocation can be slower than stack allocation, and poor cache locality may cause a lot of cache misses if you design your objects and data structures in such a way that there is not a lot of contiguous memory access. So from this standpoint, it depends on what your design and code use goals are.

Even setting this aside, you have to question your copy semantics too. If you want deep copies of your objects (and your objects' objects are also deeply copied), then why even store pointers? If it's okay to have shared memory due to copy semantics, then store pointers but make sure you don't free the memory twice in the dtor.
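To make the two copy behaviours concrete, here is a small sketch. It assumes `std::shared_ptr` as one way of avoiding the double free (the answer does not name a specific mechanism), and `ByValue`, `Shared` and `copy_semantics_demo` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

struct Object { int value = 0; };

// Stores the sub-object directly: the compiler-generated copy is a deep copy.
struct ByValue {
    Object object;
};

// Shares the sub-object: copies refer to the same Object, and shared_ptr
// frees it exactly once, when the last owner is destroyed.
struct Shared {
    std::shared_ptr<Object> object = std::make_shared<Object>();
};

void copy_semantics_demo() {
    ByValue a;
    ByValue b = a;               // independent copy
    b.object.value = 7;
    assert(a.object.value == 0); // a is unaffected

    Shared c;
    Shared d = c;                // both refer to the same Object
    d.object->value = 7;
    assert(c.object->value == 7);
    assert(c.object.use_count() == 2);
}
```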

I tend to use pointers under two conditions: class member initialization order matters deeply, and I'm injecting dependencies into an object. In most other cases, I use non-pointer types.

edit: There are two additional cases when I use pointers: 1) to avoid circular include dependencies (although I may use a reference in some cases), 2) With the intention of using polymorphic function calls.
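The dependency-injection and polymorphism cases might look roughly like the sketch below. `Logger`, `FileLogger`, `NullLogger` and `Service` are invented for illustration, and owning the dependency through `std::unique_ptr` is just one possible choice (a reference or raw pointer would fit if the object is owned elsewhere).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Logger {
    virtual ~Logger() = default;
    virtual std::string tag() const = 0;
};

struct FileLogger : Logger {
    std::string tag() const override { return "file"; }
};

struct NullLogger : Logger {
    std::string tag() const override { return "null"; }
};

class Service {
public:
    // The dependency is injected by the caller rather than created here.
    explicit Service(std::unique_ptr<Logger> logger) : logger_(std::move(logger)) {}
    std::string logger_tag() const { return logger_->tag(); }  // virtual dispatch
private:
    // A Logger member held by value could not refer to a derived type.
    std::unique_ptr<Logger> logger_;
};

void injection_demo() {
    Service s(std::unique_ptr<Logger>(new NullLogger()));
    assert(s.logger_tag() == "null");
}
```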


There are a few cases where you have almost no choice but to store a pointer. One obvious one is when you're creating something like a binary tree:

template <class T>
struct tree_node { 
    struct tree_node *left, *right;
    T data;
};

In this case, the definition is basically recursive, and you don't know up-front how many descendants a tree node might have. You're pretty much stuck with (at least some variation of) storing pointers, and allocating descendant nodes as needed.
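As one possible variation, the child links can be owning smart pointers, with nodes allocated as values arrive. This is a sketch, not the answer's own code; `owning_node`, `tree_insert` and `node_count` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

template <class T>
struct owning_node {
    T data;
    std::unique_ptr<owning_node> left, right;  // children are allocated on demand
    explicit owning_node(const T& d) : data(d) {}
};

// Inserts value into the (unbalanced) binary search tree rooted at root.
template <class T>
void tree_insert(std::unique_ptr<owning_node<T>>& root, const T& value) {
    if (!root)
        root.reset(new owning_node<T>(value));
    else if (value < root->data)
        tree_insert(root->left, value);
    else
        tree_insert(root->right, value);
}

// Counts the nodes in the tree.
template <class T>
int node_count(const std::unique_ptr<owning_node<T>>& root) {
    return root ? 1 + node_count(root->left) + node_count(root->right) : 0;
}

void tree_demo() {
    std::unique_ptr<owning_node<int>> root;
    tree_insert(root, 5);
    tree_insert(root, 2);
    assert(node_count(root) == 2);
}   // the whole tree is released when root goes out of scope
```

Because destruction here is recursive, a very deep (degenerate) tree could itself run into stack limits; a production container would usually tear down iteratively.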

There are also cases like dynamic strings where you have only a single object (or array of objects) in the parent object, but its size can vary over a wide enough range that you just about need to (at least provide for the possibility to) use dynamic allocation. With strings, small sizes are common enough that there's a fairly widely-used "short string optimization", where the string object directly includes enough space for strings up to some limit, as well as a pointer to allow dynamic allocation if the string exceeds that size:

template <class T>
class some_string { 
    static const size_t limit = 20;
    size_t allocated;
    size_t in_use;
    union { 
        T short_data[limit];
        T *long_data;
    };

    // ...

};

A less obvious reason to use a pointer instead of directly storing a sub-object is for the sake of exception safety. Just for one obvious example, if you store only pointers in a parent object, that can (usually does) make it trivial to provide a swap for those objects that gives the nothrow guarantee:

template <class T>
class parent { 
    T *data;

    friend void swap(parent &a, parent &b) noexcept { 
         T *temp = a.data;
         a.data = b.data;
         b.data = temp;
    }
};

With only a couple of (usually valid) assumptions:

  1. the pointers are valid to start with, and
  2. assigning valid pointers will never throw an exception

...it's trivial for this swap to give the nothrow guarantee unconditionally (i.e., we can just say: "swap will not throw"). If parent stored objects directly instead of pointers, we could only guarantee that conditionally (e.g., "swap will throw if and only if the copy constructor or assignment operator for T throws").

For C++11, using a pointer like this often (usually?) makes it easy to provide an extremely efficient move constructor (that also gives the nothrow guarantee). Of course, using a pointer to (most of) the data isn't the only possible route to fast move construction -- but it is an easy one.
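A sketch of what such a move constructor could look like, assuming a hypothetical `holder` class that owns a single heap-allocated `T` (this is not the `parent` class above, just the same idea in self-contained form):

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <class T>
class holder {
public:
    explicit holder(const T& v) : data(new T(v)) {}
    ~holder() { delete data; }

    // Moving only copies a pointer and nulls the source, so it cannot throw,
    // no matter what T's own copy or move operations do.
    holder(holder&& other) noexcept : data(other.data) { other.data = nullptr; }

    holder(const holder&) = delete;
    holder& operator=(const holder&) = delete;

    const T* get() const { return data; }
private:
    T* data;
};

static_assert(std::is_nothrow_move_constructible<holder<int>>::value,
              "moving a holder never throws");

void move_demo() {
    holder<int> a(3);
    holder<int> b(std::move(a));
    assert(a.get() == nullptr);  // a gave up its pointer
    assert(*b.get() == 3);       // b now owns the value
}
```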

Finally, there are the cases I suspect you had in mind when you asked the question -- ones where the logic involved doesn't necessarily indicate whether you should use automatic or dynamic allocation. In this case, it's (obviously) a judgement call. From a purely theoretical viewpoint, it probably makes no difference at all which you use in these cases. From a practical viewpoint, however, it can make quite a bit of difference. Even though neither the C nor C++ standard guarantees (or even hints at) anything of the sort, the reality is that on most typical systems, objects using automatic allocation will end up on the stack. On most typical systems (e.g., Windows, Linux) the stack is limited to only a fairly small fraction of the available memory (typically on the order of single-digit to low double-digit megabytes).

This means that if all the objects of these types that might exist at any given time might exceed a few megabytes (or so) you need to ensure that (at least most of) the data is allocated dynamically, not automatically. There are two ways to do that: you can either leave it to the user to allocate the parent objects dynamically when/if they might exceed the available stack space, or else you can have the user work with relatively small "shell" objects that allocate space dynamically on the user's behalf.

If that's at all likely to be an issue, it's almost always preferable for the class to handle the dynamic allocation instead of forcing the user to do so. This has two obvious good points:

  1. The user gets to use scope-bound resource management (SBRM, aka RAII), and
  2. The effects of limited stack space are limited instead of "percolating" through the whole design.
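A rough sketch of such a "shell" object, assuming a hypothetical `big_buffer` class that delegates its storage to `std::vector`. The size check reflects typical implementations rather than anything the standard promises.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class big_buffer {
public:
    // The payload comes from the free store, whatever the storage of the shell.
    explicit big_buffer(std::size_t n) : bytes_(n) {}
    std::size_t size() const { return bytes_.size(); }
private:
    std::vector<unsigned char> bytes_;
};

std::size_t shell_demo() {
    // The shell itself is small on typical implementations (an assumption,
    // not a guarantee), so it is safe to declare as an automatic object.
    static_assert(sizeof(big_buffer) < 1024, "shell object stays small");

    big_buffer buf(8u * 1024u * 1024u);  // 8 MiB payload, allocated dynamically
    return buf.size();
}   // the vector's destructor frees the payload when buf goes out of scope
```

The caller writes ordinary scoped code and never sees `new` or `delete`, which is the first good point in the list above.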

Bottom line: especially for a template where the type being stored isn't known up-front, I'd tend to favor a pointer and dynamic allocation. I'd reserve direct storage of sub-objects primarily to situations where I know the stored type will (almost?) always be quite small, or where profiling has indicated that dynamic allocation is causing a real speed problem. In the latter case, however, I'd give at least some thought to alternatives like overloading operator new for that class.
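For reference, a class-specific operator new can be sketched as below. The `Particle` type and its `allocations` counter are hypothetical, and this version simply forwards to the global allocator; a real optimization would replace the forwarding calls with something like a fixed-size pool.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Particle {
    float x, y, z;

    static int allocations;  // hypothetical counter, for illustration only

    // Called for `new Particle`; this is where a custom pool could hand out memory.
    static void* operator new(std::size_t size) {
        ++allocations;
        return ::operator new(size);  // forwards to the global allocator in this sketch
    }

    // Must release memory the same way operator new obtained it.
    static void operator delete(void* p) noexcept {
        ::operator delete(p);
    }
};

int Particle::allocations = 0;

void operator_new_demo() {
    int before = Particle::allocations;
    Particle* p = new Particle();
    assert(Particle::allocations == before + 1);
    delete p;
}
```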
