开发者

Where is $_ being modified in this perl code?

The following perl code generates a warning in PerlCritic (by Activestate):

sub natural_sort {
    my @sorted;
    @sorted = grep {s/(^|\D)0+(\d)/$1$2/g,1} sort grep {s/(\d+)/sprintf"%06.6d",$1/ge,1} @_;
}

The warning generated is:

Don't mod开发者_运维问答ify $_ in list functions

More info about that warning here

I don't understand the warning because I don't think I'm modifying $_, although I suppose I must be. Can someone explain it to me please?


Both of your greps are modifying $_ because you're using s//. For example, this:

grep {s/(^|\D)0+(\d)/$1$2/g,1}

is the same as this:

grep { $_ =~ s/(^|\D)0+(\d)/$1$2/g; 1 }

I think you'd be better off using map as you are not filtering anything with your greps, you're just using grep as an iterator:

sub natural_sort {
    my $t;
    return map { ($t = $_) =~ s/(^|\D)0+(\d)/$1$2/g; $t }
           sort
           map { ($t = $_) =~ s/(\d+)/sprintf"%06.6d",$1/ge; $t }
           @_;
}

That should do the same thing and keep critic quiet. You might want to have a look at List::MoreUtils if you want some nicer list operators than plain map.


You are doing a substitution ( i.e. s/// ) in the grep, which modifies $_ i.e. the list being grepped.


This and other cases are explained in perldoc perlvar:

Here are the places where Perl will assume $_ even if you don't use it:

  • The following functions:

abs, alarm, chomp, chop, chr, chroot, cos, defined, eval, exp, glob, hex, int, lc, lcfirst, length, log, lstat, mkdir, oct, ord, pos, print, quotemeta, readlink, readpipe, ref, require, reverse (in scalar context only), rmdir, sin, split (on its second argument), sqrt, stat, study, uc, ucfirst, unlink, unpack.

  • All file tests (-f , -d ) except for -t , which defaults to STDIN. See -X

  • The pattern matching operations m//, s/// and tr/// (aka y///) when used without an =~ operator.

  • The default iterator variable in a foreach loop if no other variable is supplied.

  • The implicit iterator variable in the grep() and map() functions.

  • The implicit variable of given().

  • The default place to put an input record when a operation's result is tested by itself as the sole
    criterion of a while test. Outside a
    while test, this will not happen.


Many people have correctly answered that the s operator is modifying $_, however in the soon to be released Perl 5.14.0 there will be the new r flag for the s operator (i.e. s///r) which rather than modify in-place will return the modified elements. Read more at The Effective Perler . You can use perlbrew to install this new version.

Edit: Perl 5.14 is now available! Announcement Announcement Delta

Here is the function suggested by mu (using map) but using this functionality:

use 5.14.0;

sub natural_sort {
    return map { s/(^|\D)0+(\d)/$1$2/gr }
           sort
           map { s/(\d+)/sprintf"%06.6d",$1/gre }
           @_;
}


The VERY important part that other answers have missed is that the line

grep {s/(\d+)/sprintf"%06.6d",$1/ge,1} @_;

Is actually modifying the arguments passed into the function, and not copies of them.

grep is a filtering command, and the value in $_ inside the code block is an alias to one of the values in @_. @_ in turn contains aliases to the arguments passed to the function, so when the s/// operator performs its substitution, the change is being made to the original argument. This is shown in the following example:

sub test {grep {s/a/b/g; 1} @_}

my @array = qw(cat bat sat);

my @new = test @array;

say "@new";   # prints "cbt bbt sbt" as it should
say "@array"; # prints "cbt bbt sbt" as well, which is probably an error

The behavior you are looking for (apply a function that modifies $_ to a copy of a list) has been encapsulated as the apply function in a number of modules. My module List::Gen contains such an implementation. apply is also fairly simple to write yourself:

sub apply (&@) {
    my ($sub, @ret) = @_;
    $sub->() for @ret;
    wantarray ? @ret : pop @ret
}

With that, your code could be rewritten as:

sub natural_sort {
    apply {s/(^|\D)0+(\d)/$1$2/g} sort apply {s/(\d+)/sprintf"%06.6d",$1/ge} @_
}

If your goal with the repeated substitutions is to perform a sort of the original data with a transient modification applied, you should look into a Perl idiom known as the Schwartzian transform which is a more efficient way of achieving that goal.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜