开发者

Weird behaviour with vector::erase and std::remove_if with end range different from vector.end()

I need to remove elements from the middle of a std::vector.

So I tried:

struct IsEven {
    bool operator()(int ele)
    {
        return ele % 2 == 0;
    }
};

    int elements[] = {1, 2, 3, 4, 5, 6};
    std::vector<int> ints(elements, elements+6);

    std::vector<int>::iterator it = std::remove_if(ints.begin() + 2, ints.begin() + 4, IsEven());
    ints.erase(it, ints.end());

After this I would expect that the ints vector have: [1开发者_运维技巧, 2, 3, 5, 6].

In the debugger of Visual studio 2008, after the std::remove_if line, the elements of ints are modified, I'm guessing I'm into some sort of undefined behaviour here.

So, how do I remove elements from a Range of a vector?


Edit: Sorry, the original version of this was incorrect. Fixed.

Here's what's going on. Your input to remove_if is:

1  2  3  4  5  6
      ^     ^
    begin  end

And the remove_if algorithm looks at all numbers between begin and end (including begin, but excluding end), and removes all elements between that match your predicate. So after remove_if runs, your vector looks like this

1  2  3  ?  5  6
      ^  ^
  begin  new_end

Where ? is a value that I don't think is deterministic, although if it's guaranteed to be anything it would be 4. And new_end, which points to the new end of the input sequence you gave it, with the matching elements now removed, is what is returned by std::remove_if. Note that std::remove_if doesn't touch anything beyond the subsequence that you gave it. This might make more sense with a more extended example.

Say that this is your input:

1  2  3  4  5  6  7  8  9  10
      ^              ^
    begin           end

After std::remove_if, you get:

1  2  3  5  7  ?  ?  8  9  10
      ^        ^
    begin      new_end

Think about this for a moment. What it has done is remove the 4 and the 6 from the subsequence, and then shift everything within the subsequence down to fill in the removed elements, and then moved the end iterator to the new end of the same subsequence. The goal is to satisfy the requirement that the (begin, new_end] sequence that it produces is the same as the (begin, end] subsequence that you passed in, but with certain elements removed. Anything at or beyond the end that you passed in is left untouched.

What you want to get rid of, then, is everything between the end iterator that was returned, and the original end iterator that you gave it. These are the ? "garbage" values. So your erase call should actually be:

ints.erase(it, ints.begin()+4);

The call to erase that you have just erases everything beyond the end of the subsequence that you performed the removal on, which isn't what you want here.

What makes this complicated is that the remove_if algorithm doesn't actually call erase() on the vector, or change the size of the vector at any point. It just shifts elements around and leaves some "garbage" elements after the end of the subsequence that you asked it to process. This seems silly, but the whole reason that the STL does it this way is to avoid the problem with invalidated iterators that doublep brought up (and to be able to run on things that aren't STL containers, like raw arrays).


Erasing elements in std::vector invalidates iterators past the removed element, so you cannot use "foreign" functions that accept ranges. You need to do that in a different way.

EDIT:

In general, you can use the fact that erasing one element "shifts" all elements at further positions one back. Something like this:

for (size_t scan = 2, end = 4; scan != end; )
  {
     if (/* some predicate on ints[scan] */)
       {
         ints.erase (ints.begin () + scan);
         --end;
       }
     else
       ++scan;
  }

Note that std::vector isn't suited for erasing elements in the middle. You should consider something else (std::list?) if you do that often.

EDIT 2:

As clarified by comments, first paragraph is not true. In such case std::remove_if should be more efficient than what I suggested in the first edit, so disregard this answer. (Keeping it for the comments.)


The behavior isn't weird - you're erasing the wrong range. std::remove_if moves elements it "removes" to the end of the input range. In this case, what you're looking for would be to do:

ints.erase(it, ints.begin() + 4 /* your end of range */);

From C++ in a Nutshell:

The remove_if function template "removes" items for which pred returns false from the range [first, last). The return value is one past the new end of the range. The relative order of items that are not removed is stable.

Nothing is actually erased from the underlying container; instead, items to the right are assigned to new positions so they overwrite the elements for which pred returns false. See Figure 13-13 (under remove_copy) for an example of the removal process.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜