Should I use std::for_each?
I'm always trying to learn more about the languages I use (different styles, frameworks, patterns, etc). I've noticed that I never use std::for_each, so I thought that perhaps I should start. The goal in such cases is to expand my mind and not to improve the code in some measure (readability, expressiveness, compactness, etc).
So with that context in mind, is it a good idea to use std::for_each for simple tasks like, say, printing out a vector:
for_each(v.begin(), v.end(), [](int n) { cout << n << endl; });
(the [](int n) being a lambda function) instead of:
for(int i=0; i<v.size(); i++) { cout << v[i] << endl; }
I hope this question doesn't seem pointless. I guess it almost asks a larger question: should an intermediate programmer use a language feature he doesn't really need right now, just so that he understands it better for a time when it may greatly benefit him? This larger question has probably already been asked (e.g. here).
There is an advantage to using std::for_each instead of an old school for loop (or even the newfangled C++0x range-for loop): you can look at the first word of the statement and you know exactly what the statement does.
When you see the for_each, you know that the operation in the lambda is performed exactly once for each element in the range (assuming no exceptions are thrown). It isn't possible to break out of the loop early before every element has been processed and it isn't possible to skip elements or evaluate the body of the loop for one element multiple times.
With the for loop, you have to read the entire body of the loop to know what it does. It may have continue, break, or return statements in it that alter the control flow. It may have statements that modify the iterator or index variable(s). There is no way to know without examining the entire loop.
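To make that concrete, here is a small sketch. The function names sum_all and sum_until_negative are made up for illustration; the two loops look alike at a glance, but only the for_each version is guaranteed to visit every element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: the for_each call alone tells the reader that the
// lambda runs once for every element.
int sum_all(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int n) { total += n; });
    return total;
}

// Hypothetical helper: a plain loop whose body hides a break, so everything
// after the first negative value is silently ignored.
int sum_until_negative(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
        if (v[i] < 0) break;  // only visible by reading the whole body
        total += v[i];
    }
    return total;
}
```

For the same input the two functions can return different results, and you only find out why by reading the body of the second one.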
Herb Sutter discussed the advantages of using algorithms and lambda expressions in a recent presentation to the Northwest C++ Users Group.
Note that you can actually use the std::copy algorithm here if you'd prefer:
std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, "\n"));
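A self-contained version of that one-liner might look like the sketch below. The wrapper name print_lines is hypothetical; taking the stream as a parameter is just an assumption that makes it easy to try with std::cout or with a string stream.

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: writes each element to `out`, followed by a newline.
void print_lines(const std::vector<int>& v, std::ostream& out) {
    std::copy(v.begin(), v.end(), std::ostream_iterator<int>(out, "\n"));
}
```

Calling print_lines(v, std::cout) prints to the console; passing a std::ostringstream instead captures the same text in a string.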
It depends.
The power of for_each is that you can use it with any container whose iterators satisfy the input iterator concept, so it is generically usable on any container. That increases maintainability, since you can swap out the container without changing anything else. The same doesn't hold true for a loop over the size of a vector. The only other containers you could swap in without having to change that loop are other random-access ones.
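As a rough sketch of that point, the template below (the name to_lines is made up) only needs begin() and end(), so the same code accepts a std::vector and a std::list. An index-based loop would not compile for the list.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins the elements of any container, one per line.
// Nothing here relies on operator[] or random access.
template <typename Container>
std::string to_lines(const Container& c) {
    typedef typename Container::value_type value_type;
    std::ostringstream out;
    std::for_each(c.begin(), c.end(),
                  [&out](const value_type& x) { out << x << '\n'; });
    return out.str();
}
```

Swapping std::vector<int> for std::list<int> at the call site requires no change to to_lines.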
Now, if you'd type out the iterator version yourself, the typical version looks like this:
// substitute 'container' with a container of your choice
for(std::container<T>::iterator it = c.begin(); it != c.end(); ++it){
// ....
}
Rather lengthy, eh? C++0x shortens that with the auto keyword:
for(auto it = c.begin(); it != c.end(); ++it){
// ....
}
Already nicer, but still not perfect. You're calling end on every iteration, and that can be done better:
for(auto it = c.begin(), ite = c.end(); it != ite; ++it){
// ....
}
Looks good now. Still, it's longer than the equivalent for_each version:
std::for_each(c.begin(), c.end(), [&](T& item){
// ...
});
With "equivalent" being slightly subjective, as the T in the parameter list of the lambda could be some verbose type like my_type<int>::nested_type. Though one can typedef one's way around that. Honestly, I still don't understand why lambdas weren't allowed to be polymorphic with type deduction...
Now, another thing to consider is that for_each, the name itself, already expresses an intent. It says that no elements in the sequence will be skipped, whereas a normal for loop might skip some.
That brings me to another point: since for_each is intended to run over the whole sequence and apply an operation to every item in the container, it is not designed to handle early returns or breaks in general. continue can be simulated with a return statement from the lambda / functor.
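Here is a minimal sketch of the continue trick (sum_even is a hypothetical name). The return only leaves the current call of the lambda; for_each then moves on to the next element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: adds up only the even values.
int sum_even(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int n) {
        if (n % 2 != 0) return;  // behaves like 'continue': skip odd values
        total += n;
    });
    return total;
}
```

There is no comparable trick for break: short of throwing an exception, the lambda cannot stop for_each from visiting the remaining elements.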
So, use for_each where you really want to apply an operation to every item in the collection.
On a side note, for_each might just be "deprecated" with C++0x thanks to the awesome range-based for-loops (also called foreach loops):
for(auto& item : container){
// ...
}
Which is way shorter (yay) and allows all three options of:
- returning early (even with a return value!)
- breaking out of the loop and
- skipping over some elements.
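A short sketch of those three options, using made-up helper names (first_even_above and sum_before_zero) and assuming a compiler with range-based for support:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first even value greater than `limit`,
// or -1 if there is none.
int first_even_above(const std::vector<int>& v, int limit) {
    for (int n : v) {
        if (n % 2 != 0) continue;  // skip over an element
        if (n > limit) return n;   // return early, with a value
    }
    return -1;
}

// Hypothetical helper: sums the values that come before the first zero.
int sum_before_zero(const std::vector<int>& v) {
    int total = 0;
    for (int n : v) {
        if (n == 0) break;  // break out of the loop
        total += n;
    }
    return total;
}
```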
I would generally recommend using std::for_each. Your example for loop does not work for non-random-access containers. You can write the same loop using iterators, but it's usually a pain because you have to write out std::SomeContainerName<SomeReallyLongUserType>::const_iterator as the type of the iteration variable. std::for_each insulates you from this, and it also evaluates end only once, automatically.
IMHO, you should try these new features in your test code.
In production code you should use the features you feel comfortable with (i.e. if you feel comfortable with for_each, you can use it).
for_each is the most general of the algorithms that iterate over a sequence, and thus the least expressive. If the goal of the iteration can be expressed in terms of transform, accumulate, or copy, I feel that it's better to use the specific algorithm rather than the generic for_each.
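For instance (a sketch with hypothetical helper names total and doubled), summing and mapping both read more directly with the dedicated algorithms than with a for_each that mutates captured state:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: std::accumulate states "fold into one value".
int total(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Hypothetical helper: std::transform states "map each element to a new one".
std::vector<int> doubled(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::transform(v.begin(), v.end(), out.begin(),
                   [](int n) { return n * 2; });
    return out;
}
```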
With the new C++0x range for (supported in gcc 4.6.0, try that out!), for_each might even lose its niche of being the most generic way to apply a function to a sequence.
You can use the range-based for loop from C++11.
For example:
T arr[5];
for (T & x : arr) // use a reference if you want to write to the elements
{
// do something with x...
}
Where T is any type you want.
It works for every STL container and for classic arrays.
Well... that works, but for printing a vector (or the content of other container types) I prefer this :
std::copy(v.begin(), v.end(), std::ostream_iterator< int >( std::cout, " " ) );
Boost.Range simplifies the use of standard algorithms. For your example you could write:
boost::for_each(v, [](int n) { cout << n << endl; });
(or boost::copy with an ostream iterator as suggested in other answers).
Note that the "traditional" example is buggy:
for(int i=0; i<v.size(); i++) { cout << v[i] << endl; }
This assumes that int can always represent the index of every value in the vector. There are actually two ways this can go wrong.
One is that int may be of lower rank than std::vector<T>::size_type. On a 64-bit machine, ints are typically 32 bits wide but v.size() will almost certainly be 64 bits wide. If you manage to stuff more than 2^31 - 1 elements into the vector, your index will never reach the end.
The second problem is that you're comparing a signed value (int) to an unsigned value (std::vector<T>::size_type). So even if they were of the same rank, when the size exceeds the maximum value of int, the index will overflow and trigger undefined behavior.
You may have prior knowledge that, for this vector, those error conditions will never be true. But you'd either have to ignore or disable the compiler warnings. And if you disable them, then you don't get the benefit of those warnings helping you find actual bugs elsewhere in your code. (I've spent lots of time tracking down bugs that should have been detected by these compiler warnings, if the code had made it feasible to enable them.)
So, yes, for_each (or any appropriate <algorithm>) is better because it avoids this pernicious abuse of ints. You could also use a range-based for loop or an iterator-based loop with auto.
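If you do want to keep a hand-written loop, the sketch below shows two ways to avoid the int index. The function names are invented for this example, and the output goes to a string only so the two versions are easy to compare.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: indexes with the vector's own size_type, so there is
// no signed/unsigned comparison and no narrower index type.
std::string lines_by_index(const std::vector<int>& v) {
    std::ostringstream out;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
        out << v[i] << '\n';
    }
    return out.str();
}

// Hypothetical helper: iterator-based loop with auto, no index at all.
std::string lines_by_iterator(const std::vector<int>& v) {
    std::ostringstream out;
    for (auto it = v.begin(), end = v.end(); it != end; ++it) {
        out << *it << '\n';
    }
    return out.str();
}
```

Both versions compile cleanly with sign-comparison warnings enabled, which the original int loop typically does not.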
An additional advantage to using <algorithm>s or iterators rather than indexes is that it gives you more flexibility to change container types in the future without refactoring all the code that uses them.