In-place C++ set intersection

2022-12-12 06:45

The standard way of intersecting two sets in C++ is to do the following:

std::set<int> set_1;  // With some elements
std::set<int> set_2;  // With some other elements
std::set<int> the_intersection;  // Destination of intersect
std::set_intersection(set_1.begin(), set_1.end(), set_2.begin(), set_2.end(), std::inserter(the_intersection, the_intersection.end()));

How would I go about doing an in-place set intersection? That is, I want set_1 to have the results of the call to set_intersection. Obviously, I can just do a set_1.swap(the_intersection), but this is a lot less efficient than intersecting in-place.
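For reference, the copy-then-swap baseline mentioned above can be wrapped up as follows. This is only a sketch: the function name intersect_via_swap is made up for illustration, and it assumes both sets use the default ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical helper (name not from the question): builds the intersection
// in a temporary set, then swaps it into `a`. Needs extra allocations for
// every element that survives.
template <typename T>
void intersect_via_swap(std::set<T>& a, const std::set<T>& b) {
    std::set<T> result;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(result, result.end()));
    a.swap(result);  // constant time; the old contents are destroyed with `result`
}
```

The swap itself is cheap; the cost is in allocating new nodes for the result and freeing all the old ones.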

I think I've got it:

std::set<int>::iterator it1 = set_1.begin();
std::set<int>::iterator it2 = set_2.begin();
while ( (it1 != set_1.end()) && (it2 != set_2.end()) ) {
    if (*it1 < *it2) {
        set_1.erase(it1++);
    } else if (*it2 < *it1) {
        ++it2;
    } else { // *it1 == *it2
        ++it1;
        ++it2;
    }
}
// Anything left in set_1 from here on did not appear in set_2,
// so we remove it.
set_1.erase(it1, set_1.end());

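Does anyone see any problems? It seems to be O(n) in the combined size of the two sets. According to cplusplus.com, std::set erase(position) is amortized constant time, while erase(first, last) is linear in the number of elements removed (the standard also allows a logarithmic term in the container size), so the final range erase does not change the overall bound.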
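The same loop can be packaged as a reusable function. A minimal sketch, assuming C++11 or later (where erase returns the iterator following the erased element) and the default ordering; the name intersect_in_place is hypothetical.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: keeps in `a` only the elements that also appear in `b`.
// Assumes both sets are ordered with the default std::less<T>.
template <typename T>
void intersect_in_place(std::set<T>& a, const std::set<T>& b) {
    auto it_a = a.begin();
    auto it_b = b.begin();
    while (it_a != a.end() && it_b != b.end()) {
        if (*it_a < *it_b) {
            it_a = a.erase(it_a);  // not in b: erase and continue with the next element
        } else if (*it_b < *it_a) {
            ++it_b;                // b is behind: advance it
        } else {
            ++it_a;                // equal: keep the element and advance both
            ++it_b;
        }
    }
    // Whatever is left in a is larger than everything in b.
    a.erase(it_a, a.end());
}
```

Erasing from a std::set only invalidates iterators to the erased element, which is why the other iterators stay usable throughout the loop.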

You can go through set_1, check whether each element exists in set_2, and erase it if it doesn't. Since sets are sorted, you can compare them in linear time, and erasing an element through an iterator is amortized constant time. I wouldn't count on it being more efficient than what you started with, though; benchmarking would be wise if it matters to you.
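A literal reading of "check each element to see if it exists in set_2" is a lookup per element. The sketch below does that with count(), which costs O(n log m) rather than the linear time of a lockstep walk; the name retain_common is made up for illustration, and C++11 erase semantics are assumed.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: erases from `a` every element that `b` does not contain.
// One logarithmic lookup in `b` per element of `a`.
template <typename T>
void retain_common(std::set<T>& a, const std::set<T>& b) {
    for (auto it = a.begin(); it != a.end(); ) {
        if (b.count(*it) == 0) {
            it = a.erase(it);  // erase returns the next valid iterator
        } else {
            ++it;
        }
    }
}
```

This version may be worth trying when set_1 is much smaller than set_2, but as noted above, only a benchmark will say which one wins.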

This doesn't directly answer the question, but maybe someone will find it helpful.

For std::vector, it is not safe to call the standard algorithm with set_1.begin() as the output iterator (see below), although the clang, gcc, and Microsoft implementations happen to work. Note that set_2 could be any sorted range, not just a std::vector.

std::vector<int> set_1;  // With some elements
std::vector<int> set_2;  // With some other elements
auto end = std::set_intersection(
    set_1.begin(), set_1.end(),
    set_2.begin(), set_2.end(),
    set_1.begin()  // intersection is written into set_1
);
set_1.erase(end, set_1.end()); // erase redundant elements

Update:

Thanks to @Keith, who found that the C++ Standard (25.4.5.3) requires the following:

The resulting range shall not overlap with either of the original ranges

So what I initially proposed is wrong, although it works with the major STL implementations. If you want to be on the safe side and don't want extra allocations, copy the implementation of your choice into your code base and use it instead of std::set_intersection. I don't really understand the reason for this restriction; please comment if you know the answer.
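One way to stay within the rules is a small hand-written loop instead of std::set_intersection. The following is a sketch, not a copy of any library's implementation: the name intersect_sorted_in_place is hypothetical, and both vectors are assumed to be sorted with operator<.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: in-place intersection of two sorted vectors.
// The write position `out` never gets ahead of the read position `it_a`,
// so no element of `a` is overwritten before it has been read.
template <typename T>
void intersect_sorted_in_place(std::vector<T>& a, const std::vector<T>& b) {
    assert(std::is_sorted(a.begin(), a.end()));  // precondition, checked in debug builds
    assert(std::is_sorted(b.begin(), b.end()));
    auto out = a.begin();
    auto it_a = a.begin();
    auto it_b = b.begin();
    while (it_a != a.end() && it_b != b.end()) {
        if (*it_a < *it_b) {
            ++it_a;
        } else if (*it_b < *it_a) {
            ++it_b;
        } else {
            if (out != it_a) {
                *out = std::move(*it_a);  // compact the kept element toward the front
            }
            ++out;
            ++it_a;
            ++it_b;
        }
    }
    a.erase(out, a.end());  // drop the leftover tail
}
```

Because this is our own code, the overlap between input and output is well defined here, and no extra allocation is needed.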
