C++ equivalent of Python difference_update?
s1 and s2 are sets (Python set or C++ std::set)
To add the elements of s2 to s1 (set union), you can doPython: s1.update(s2)
C++: s1.insert(s2.begin(), s2.end());
To remove the elements of s2 from s1 (set difference), you can do
Python: s1.difference_update(s2)
What is the C++ equivalent of this? The code
s1.erase(s2.begin(), s2.end());
does not work, for s1.erase() requires iterators from s1.The code
std::set<T> s3;
std::set_difference(s1.begin(), s1.end(), s2.begin(), s2.end开发者_开发百科(), std::inserter(s3, s3.end());
s1.swap(s3);
works, but seems overly complex, at least compared with Python.
Is there a simpler way?
Using std::set_difference
is the idiomatic way to do this in C++. You have stumbled across one of the primary differences (pun intended) between C++/STL and many other languages. STL does not bundle operations directly with the data structures. This is why std::set
does not implement a difference routine.
Basically, algorithms such as std::set_difference
write the result of the operation to another object. It is interesting to note that the algorithm does not require that either or both of the operands are actually std::set
. The definition of the algorithm is:
Effects: Copies the elements of the range
[first1, last1)
which are not present in the range[first2, last2)
to the range beginning atresult
. The elements in the constructed range are sorted.Requires: The resulting range shall not overlap with either of the original ranges. Input ranges are required to be order by the same
operator<
.Returns: The end of the constructed range.
Complexity: At most
2 * ((last1 - first1) + (last2 - first2)) - 1
comparisons
The interesting difference is that the C++ version is applicable to any two sorted ranges. In most languages, you are forced to coerce or translate the calling object (left-hand operand) into a set before you have access to the set difference algorithm.
This is not really pertinent to your question, but this is the reason that the various set algorithms are modeled as free-standing algorithms instead of member methods.
You should iterate through the second set:
for( set< T >::iterator iter = s2.begin(); iter != s2.end(); ++iter )
{
s1.erase( *iter );
}
This will could be cheaper than using std::set_difference
- set_difference
copies the unique objects into a new container, but it takes linear time, while .erase
will not copy anything, but is O(n * log( n ) )
.
In other words, depends on the container, you could choose the way, that will be faster for your case.
Thanks David Rodríguez - dribeas
for the remark! (:
EDIT: Doh! I thought about BOOST_FOREACH at the very beginning, but I was wrong that it could not be used.. - you don't need the iterator, but just the value.. As user763305 said by himself/herself.
In c++ there is no difference
method in the set. The set_difference
looks much more awkward as it is more generic than applying a difference on two sets. Of course you can implement your own version of in place difference on sets:
template <typename T, typename Compare, typename Allocator>
void my_set_difference( std::set<T,Compare,Allocator>& lhs, std::set<T,Compare,Allocator> const & rhs )
{
typedef std::set<T,Comapre,Allocator> set_t;
typedef typename set_t::iterator iterator;
typedef typename set_t::const_iterator const_iterator;
const_iterator rit = rhs.begin(), rend = rhs.end();
iterator it = lhs.begin(), end = lhs.end();
while ( it != end && rit != rend )
{
if ( lhs.key_comp( *it, *rit ) ) {
++it;
} else if ( lhs.key_comp( *rit, *it ) ) {
++rit;
} else {
++rit;
lhs.erase( it++ );
}
}
}
The performance of this algorithm will be linear in the size of the arguments, and require no extra copies as it modifies the first argument in place.
You can also do it with remove_if
writing your own functor for testing existence in a set, e.g.
std::remove_if(s1.begin(), s1.end(), ExistIn(s2));
I suppose that set_difference
is more efficient though as it probably scans both sets only once
Python set is unordered, and is more of an equivalent of C++ std::unordered_set than std::set, which is ordered.
David Rodríguez's algorithm relies on the fact that std::set is ordered, so the lhs and rhs sets can be traversed in the way as exhibit in the algorithm.
For a more general solution that works for both ordered and unordered sets, Kiril Kirov's algorithm should be the safe one to adopt if you are enforcing/preserving the "unorderedness" nature of Python set.
精彩评论