开发者

invasive vs non-invasive ref-counted pointers in C++

For the past few years, I've开发者_如何学编程 generally accepted that

if I am going to use ref-counted smart pointers

invasive smart pointers is the way to go

--

However, I'm starting to like non-invasive smart pointers due to the following:

  1. I only use smart pointers (so no Foo* lying around, only Ptr)
  2. I'm starting to build custom allocators for each class. (So Foo would overload operator new).
  3. Now, if Foo has a list of all Ptr (as it easily can with non-invasive smart pointers).
  4. Then, I can avoid memory fragmentation issues since class Foo move the objects around (and just update the corresponding Ptr).

The only reason why this Foo moving objects around in non-invasive smart pointers being easier than invasive smart pointers is:

In non-invasive smart pointers, there is only one pointer that points to each Foo.

In invasive smart pointers, I have no idea how many objects point to each Foo.

Now, the only cost of non-invasive smart pointers ... is the double indirection. [Perhaps this screws up the caches].

Does anyone have a good study of expensive this extra layer of indirection is?

EDIT: by smart pointers, I may be referring to what others call "shared-pointers"; the whole idea is: there is a reference-count attached to objects, and when it hits 0, the object is automatically deleted


There are several important difference between invasive or non-invasive pointers:

The biggest advantage of second (non-invasive):

  • It is much simpler to implement weak reference to second one (i.e. shared_ptr/weak_ptr).

Advantage of first is when you need to get smart pointer on this (at least in case of boost::shared_ptr, std::tr1::shared_ptr)

  • You can't use shared_ptr from this in constructor and destructor.
  • It is quite non-trivial to have shared_from this in the hierarchy of classes.


Well, first and foremost, I shall remind you that shared ownership is usually a difficult beast to tame and can lead to quite difficult to root out bugs.

There are many ways not to have shared ownership. The Factory approach (implemented itself using a Boost Pointer Container) is personally one of my favorite.

Now, as for the reference counting goes....

1. Intrusive Pointers

The counter is embedded in the object itself which means:

  • you need to provide methods to add / substract to the counter, and it is your duty to make them thread-safe.
  • the count does not survive the object, so no weak_ptr, so you cannot have cycles of references in your design without using an Observer pattern... quite complicated

2. Non-Intrusive Pointers

I will only talk about boost::shared_ptr and boost::weak_ptr. I have dug into the source a bit recently to precisely look at the mechanics and there are indeed much more complicated that the above!

// extract of <boost/shared_ptr.hpp>

template <class T>
class shared_ptr
{
  T * px;                     // contained pointer
  boost::detail::shared_count pn;    // reference counter
};
  • Maintenance of the count has already been written out for you and is thread-safe.
  • You can use weak_ptr in case of cyclic references.
  • Only the one building the shared_ptr object needs to know about the object destructor (see Example)

Here is a little example, to illustrate this forward declaration magic:

 // foofwd.h
 #include <boost/shared_ptr.hpp>

 class Foo;

 typedef boost::shared_ptr<Foo> foo_ptr;

 foo_ptr make_foo();

 // foo.h
 #include "foofwd.h"

 class Foo { /** **/ };

 // foo.cpp
 #include "foo.h"

 foo_ptr make_foo() { return foo_ptr(new Foo()); }

 // main.cpp
 #include "foofwd.h"

 int main(int argc, char* argv[])
 {
   foo_ptr p = make_foo();
 } // p.get() is properly released

There is a little bit of template magic to authorized this. Basically the counter object embeds a disposer* (yet a third allocation) that allows some type erasure to take place. Really useful though, since it really allows for forward declaration!

3. Conclusion

While I agree that Intrusive Pointers are probably faster as in less allocation going on (there are 3 different blocks of memory allocated for a shared_ptr), there are also less practical.

So I'd like to point you out to the Boost Intrusive Pointer library, and more especially to its introduction:

As a general rule, if it isn't obvious whether intrusive_ptr better fits your needs than shared_ptr, try a shared_ptr-based design first.


I don't know of a study on the additional cost due to non-invasive over invasive. But I would note that non-invasive seems to be universally recommended by the C++ experts. Of course, this may mean nothing! But the reasoning is pretty sound: if you need smart pointers, it's because you want a simpler way to implement object lifetime management instead of writing it by hand, so you are emphasising correctness and simplicity over performance, which is always a good idea until you have profiled a realistic model of your whole design.

It could well be that in a simplified test, non-invasive is twice as slow as invasive, and yet in a real program that does actual work, this speed difference is lost in the noise and becomes so insignificant you can't even measure it. This is quite a common phenomenon; the things you imagine are significant to performance very often aren't.

If you find a performance bottleneck, it's possible (likely?) that the work of maintaining the reference count itself (in both approaches) will have as much effect on performance as the extra indirection in the non-invasive approach. With raw pointers, the statement:

p1 = p2;

probably only needs to move a value between two CPU registers, after the optimiser has worked its magic. But if they are reference counting smart pointers, even with invasive it's like:

if (p1 != p2)
{
    if ((p1 != 0) && (--(p1->count) == 0))
        delete p1;

    p1 = p2;

    if (p1 != 0)
        p1->count++;
}

This happens with every smart pointer argument passed to every function. So there are lots of extra accesses to potentially far-flung areas of memory, to raise and lower the count each time. And to be thread-safe, the increment and decrement-check operations have to be interlocked/atomic, which can have a severely negative effect on multiple cores.

I think of the "sweet spot" of C++ as being those situations where you don't need to manage wildly dynamic data structures like this. Instead you have a simple hierarchical pattern of object ownership, so there is an obvious single owner of each object, and the lifetime of data tends to follow the lifetime of function invocations (more often than not). You can then let the standard containers and the function call stack manage everything for you. This is emphasised in the forthcoming version of the language with rvalue references, unique_ptr, etc. which is all about transferring around the single-ownership of an object in an easy way. If you really need dynamic multi-owner lifetime management, true GC would be faster and easier to use correctly, but C++ isn't a very happy home for GC.

Another minor point: unfortunately "In non-invasive smart pointers, there is only one pointer that points to each Foo" is untrue. Inside Foo there is the this pointer that is a Foo * and so naked pointers can still be leaked out, often in quite hard-to-spot ways.


The only real cost of non-invasive ref-counting w.r.t. performance is that you sometimes require one extra allocation for the ref-counter. As far as I know, tr1::shared_ptr implementations don't do "double indirection". I suppose, it would be difficult to support conversions without letting shared_ptr store the pointer directly. A sensible shared_ptr implementation would store two pointers: One pointer to the object (no double indirection) and one pointer to some control structure.

Even the allocation overhead isn't necessary in all situations. See make_shared. C++0x will also provide a make_shared function that allocates both, the object and the ref-counter in one go which is similar in that respect to the intrusive ref-counting alternative.

[...] Besides convenience and style, such a function is also exception safe and considerably faster because it can use a single allocation for both the object and its corresponding control block, eliminating a significant portion of shared_ptr's construction overhead. This eliminates one of the major efficiency complaints about shared_ptr. [...]

In the light of shared_ptr and make_shared, I have a hard time coming up with problems where an intrusive smart pointer would beat shared_ptr significantly. Copying and destroying shared pointers might be a tad slower, though. Having said that, let me add that I rarely use these kinds of smart pointers. Most of the time unique owhership is all I need.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜