C++: shared_ptr as unordered_set's key
Consider the following code
#include <boost/unordered_set.hpp>
#include <boos开发者_开发问答t/shared_ptr.hpp>
#include <boost/make_shared.hpp>
int main()
{
boost::unordered_set<int> s;
s.insert(5);
s.insert(5);
// s.size() == 1
boost::unordered_set<boost::shared_ptr<int> > s2;
s2.insert(boost::make_shared<int>(5));
s2.insert(boost::make_shared<int>(5));
// s2.size() == 2
}
The question is: how come the size of s2 is 2 instead of 1? I'm pretty sure it must have something to do with the hash function. I tried looking at the boost docs and playing around with the hash function without luck.
Ideas?
make_shared
allocates a new int
, and wraps a shared_ptr
around it. This means that your two shared_ptr<int>
s point to different memory, and since you're creating a hash table keyed on pointer value, they are distinct keys.
For the same reason, this will result in a size of 2:
boost::unordered_set<int *> s3;
s3.insert(new int(5));
s3.insert(new int(5));
assert(s3.size() == 2);
For the most part you can consider shared_ptr
s to act just like pointers, including for comparisons, except for the auto-destruction.
You could define your own hash function and comparison predicate, and pass them as template parameters to unordered_map
, though:
struct your_equality_predicate
: std::binary_function<boost::shared_ptr<int>, boost::shared_ptr<int>, bool>
{
bool operator()(boost::shared_ptr<int> i1, boost::shared_ptr<int> i2) const {
return *i1 == *i2;
}
};
struct your_hash_function
: std::unary_function<boost::shared_ptr<int>, std::size_t>
{
std::size_t operator()(boost::shared_ptr<int> x) const {
return *x; // BAD hash function, replace with somethign better!
}
};
boost::unordered_set<int, your_hash_function, your_equality_predicate> s4;
However, this is probably a bad idea for a few reasons:
- You have the confusing situation where
x != y
buts4[x]
ands4[y]
are the same. If someone ever changes the value pointed-to by a hash key your hash will break! That is:
boost::shared_ptr<int> tmp(new int(42)); s4[tmp] = 42; *tmp = 24; // UNDEFINED BEHAVIOR
Typically with hash functions you want the key to be immutable; it will always compare the same, no matter what happens later. If you're using pointers, you usually want the pointer identity to be what is matched on, as in extra_info_hash[&some_object] = ...
; this will normally always map to the same hash value whatever some_object
's members may be. With the keys mutable after insertion is it all too easy to actually do so, resulting in undefined behavior in the hash.
Notice that in Boost <= 1.46.0, the default hash_value
of a boost::shared_ptr
is its boolean value, true
or false
.
For any shared_ptr
that is not NULL
, hash_value
evaluates to 1 (one), as the (bool)shared_ptr == true
.
In other words, you downgrade a hash set to a linked list if you are using Boost <= 1.46.0.
This is fixed in Boost 1.47.0, see https://svn.boost.org/trac/boost/ticket/5216 .
If you are using std::shared_ptr
, please define your own hash function, or use boost/functional/hash/extensions.hpp
from Boost >= 1.51.0
As you found out, the two objects inserted into s2
are distinct.
精彩评论