开发者

C++: shared_ptr as unordered_set's key

Consider the following code

#include <boost/unordered_set.hpp>
#include <boos开发者_开发问答t/shared_ptr.hpp>
#include <boost/make_shared.hpp>

int main()
{
    boost::unordered_set<int> s;
    s.insert(5);
    s.insert(5);
    // s.size() == 1 

    boost::unordered_set<boost::shared_ptr<int> > s2;
    s2.insert(boost::make_shared<int>(5));
    s2.insert(boost::make_shared<int>(5));
    // s2.size() == 2
}

The question is: how come the size of s2 is 2 instead of 1? I'm pretty sure it must have something to do with the hash function. I tried looking at the boost docs and playing around with the hash function without luck.

Ideas?


make_shared allocates a new int, and wraps a shared_ptr around it. This means that your two shared_ptr<int>s point to different memory, and since you're creating a hash table keyed on pointer value, they are distinct keys.

For the same reason, this will result in a size of 2:

boost::unordered_set<int *> s3;
s3.insert(new int(5));
s3.insert(new int(5));
assert(s3.size() == 2);

For the most part you can consider shared_ptrs to act just like pointers, including for comparisons, except for the auto-destruction.

You could define your own hash function and comparison predicate, and pass them as template parameters to unordered_map, though:

struct your_equality_predicate
    : std::binary_function<boost::shared_ptr<int>, boost::shared_ptr<int>, bool>
{
    bool operator()(boost::shared_ptr<int> i1, boost::shared_ptr<int> i2) const {
        return *i1 == *i2;
    }
};

struct your_hash_function
    : std::unary_function<boost::shared_ptr<int>, std::size_t>
{
    std::size_t operator()(boost::shared_ptr<int> x) const {
        return *x; // BAD hash function, replace with somethign better!
    }
};

boost::unordered_set<int, your_hash_function, your_equality_predicate> s4;

However, this is probably a bad idea for a few reasons:

  1. You have the confusing situation where x != y but s4[x] and s4[y] are the same.
  2. If someone ever changes the value pointed-to by a hash key your hash will break! That is:

    boost::shared_ptr<int> tmp(new int(42));
    s4[tmp] = 42;
    *tmp = 24; // UNDEFINED BEHAVIOR
    

Typically with hash functions you want the key to be immutable; it will always compare the same, no matter what happens later. If you're using pointers, you usually want the pointer identity to be what is matched on, as in extra_info_hash[&some_object] = ...; this will normally always map to the same hash value whatever some_object's members may be. With the keys mutable after insertion is it all too easy to actually do so, resulting in undefined behavior in the hash.


Notice that in Boost <= 1.46.0, the default hash_value of a boost::shared_ptr is its boolean value, true or false. For any shared_ptr that is not NULL, hash_value evaluates to 1 (one), as the (bool)shared_ptr == true.

In other words, you downgrade a hash set to a linked list if you are using Boost <= 1.46.0.

This is fixed in Boost 1.47.0, see https://svn.boost.org/trac/boost/ticket/5216 .

If you are using std::shared_ptr, please define your own hash function, or use boost/functional/hash/extensions.hpp from Boost >= 1.51.0


As you found out, the two objects inserted into s2 are distinct.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜