Is Perl's each function worth using?
From perldoc -f each we read:
There is a single iterator for each hash, shared by all each, keys, and values function calls in the program; it can be reset by reading all the elements from the hash, or by evaluating keys HASH or values HASH.
The iterator is not reset when you leave the scope containing the each(), and this can lead to bugs:
my %h = map { $_, 1 } qw(1 2 3);
while (my $k = each %h) { print "1: $k\n"; last }
while (my $k = each %h) { print "2: $k\n" }
Output:
1: 1
2: 3
2: 2
What are the common workarounds for this behavior? And is it worth using each in general?
I think it is worth using as long as you are aware of this. It's ideal when you need both key and value in iteration:
while (my ($k,$v) = each %h) {
    say "$k = $v";
}
In your example you can reset the iterator by adding keys %h; like so:
my %h = map { $_ => 1 } qw/1 2 3/;
while (my $k = each %h) { print "1: $k\n"; last }
keys %h; # reset %h
while (my $k = each %h) { print "2: $k\n" }
As of Perl 5.12, each can also iterate over an array.
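As a minimal sketch of the array form (assuming Perl 5.12 or later; the @colors data is made up for illustration), each returns an index and the element at that index:

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original post.
my @colors = qw(red green blue);

my @seen;
# On an array, each returns (index, element) pairs in index order.
while (my ($i, $color) = each @colors) {
    push @seen, "$i: $color";
}
print "$_\n" for @seen;
```

The array iterator is stateful in the same way as the hash iterator, so the early-exit caveat discussed in this thread applies here too.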
I find each to be very handy for idioms like this:
my $hashref = some_really_complicated_method_that_builds_a_large_and_deep_structure();
while (my ($key, $value) = each %$hashref)
{
    # code that does stuff with both $key and $value
}
Contrast that code to this:
my $hashref = some_really_complicated_method_that_builds_a_large_and_deep_structure();  # same call as above
foreach my $key (keys %$hashref)
{
    my $value = $hashref->{$key};
    # more code here...
}
In the first case, both $key and $value are immediately available to the body of the loop. In the second case, $value must be fetched first. Additionally, the list of keys of $hashref may be really huge, which takes up memory. This is occasionally an issue. each does not incur such overhead.
However, the drawbacks of each are not instantly apparent: if you abort the loop early, the hash's iterator is not reset. Additionally (and I find this one more serious and even less visible): you cannot call keys(), values() or another each() on the same hash from within this loop. Calling keys() or values() resets the iterator, and another each() advances it, so you lose your place in the while loop. After a reset, the while loop can continue forever, which is definitely a serious bug.
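The reset problem can be reproduced with a small sketch like the one below. The hash contents and the $iterations guard are assumptions added for illustration; the guard exists only so the demonstration terminates.

```perl
use strict;
use warnings;

# Hypothetical three-key hash for the demonstration.
my %h = (a => 1, b => 2, c => 3);

my $iterations = 0;
while (my ($k, $v) = each %h) {
    my @all = keys %h;            # resets the iterator that each is advancing
    last if ++$iterations >= 10;  # safety guard; without it the loop never ends
}
print "stopped after $iterations iterations over a 3-key hash\n";
```

Every pass through the loop sees the same first pair again, because keys rewinds the iterator before the next call to each.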
each is too dangerous to ever use, and many style guides prohibit its use completely. The danger is that if a cycle of each is aborted before the end of the hash, the next cycle will start where the previous one stopped. This can cause very hard-to-reproduce bugs; the behavior of one part of the program will depend on a completely unrelated other part of the program. You might use each right, but what about every module ever written that might use your hash (or hashref; it's the same)?
keys and values are always safe, so just use those. keys also makes it easier to traverse the hash in deterministic order, which is almost always more useful (for my $key (sort keys %hash) { ... }).
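A short sketch of that point, using a made-up %hash: leaving a keys-based loop early has no effect on a later loop, because each for loop works from its own freshly built key list.

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my %hash = (banana => 3, apple => 1, cherry => 2);

# First pass: bail out after the first key.
my @first;
for my $key (sort keys %hash) {
    push @first, $key;
    last;
}

# Second pass: still sees every key, in sorted order.
my @second;
for my $key (sort keys %hash) {
    push @second, $key;
}

print "first:  @first\n";
print "second: @second\n";
```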
each is not only worth using, it's pretty much mandatory if you want to loop over all of a tied hash too big for memory.
A void-context keys() (or values, but consistency is nice) before beginning the loop is the only "workaround" necessary; is there some reason you are looking for some other workaround?
Use the keys() function to reset the iterator. See the FAQ for more info.
each has a built-in, hidden global variable that can hurt you. Unless you need this behavior, it's safer to just use keys.
Consider this example where we want to group our k/v pairs (yes, I know printf would do this better):
#!perl
use strict;
use warnings;
use Test::More 'no_plan';
{
    my %foo = map { ($_) x 2 } (1..15);
    is( one( \%foo ), one( \%foo ), 'Calling one twice works with 15 keys' );
    is( two( \%foo ), two( \%foo ), 'Calling two twice works with 15 keys' );
}
{
    my %foo = map { ($_) x 2 } (1..105);
    is( one( \%foo ), one( \%foo ), 'Calling one twice works with 105 keys' );
    is( two( \%foo ), two( \%foo ), 'Calling two twice works with 105 keys' );
}
sub one {
    my $foo = shift;
    my $r = '';
    for( 1..9 ) {
        last unless my ($k, $v) = each %$foo;
        $r .= " $_: $k -> $v\n";
    }
    for( 10..99 ) {
        last unless my ($k, $v) = each %$foo;
        $r .= " $_: $k -> $v\n";
    }
    return $r;
}
sub two {
    my $foo = shift;
    my $r = '';
    my @k = keys %$foo;
    for( 1..9 ) {
        last unless @k;
        my $k = shift @k;
        $r .= " $_: $k -> $foo->{$k}\n";
    }
    for( 10..99 ) {
        last unless @k;
        my $k = shift @k;
        $r .= " $_: $k -> $foo->{$k}\n";
    }
    return $r;
}
Debugging the error shown in the tests above in a real application would be horribly painful. (For better output, use eq_or_diff from Test::Differences instead of is.)
Of course one() can be fixed by using keys to clear the iterator at the start and end of the subroutine. If you remember. If all your coworkers remember. It's perfectly safe as long as no one forgets.
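For reference, a sketch of what that fix might look like. The name one_fixed is hypothetical; the body is one() from the test script with a void-context keys added at each end.

```perl
use strict;
use warnings;

# Hypothetical variant of one(): same loops, plus explicit iterator resets.
sub one_fixed {
    my $foo = shift;
    my $r = '';
    keys %$foo;    # void-context keys resets the iterator before we start
    for ( 1..9 ) {
        last unless my ($k, $v) = each %$foo;
        $r .= " $_: $k -> $v\n";
    }
    for ( 10..99 ) {
        last unless my ($k, $v) = each %$foo;
        $r .= " $_: $k -> $v\n";
    }
    keys %$foo;    # reset again so the next user of the hash starts clean
    return $r;
}

my %foo = map { ($_) x 2 } (1..105);
print one_fixed(\%foo) eq one_fixed(\%foo) ? "same\n" : "different\n";
```

With the resets in place, two consecutive calls on the 105-key hash return the same string, since each call starts from the beginning of an unmodified hash.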
I don't know about you, but I'll just stick with using keys and values.
It's best if used as its name suggests: each. It's probably the wrong thing to use if you mean "give me the first key-value pair," or "give me the first two pairs" or whatever. Just keep in mind that each time you call it, you get the next pair (or the next key in scalar context).
each() can be more efficient if you are iterating through a tied hash, for example a database that contains millions of keys; that way you don't have to load all the keys in memory.