
Regex range operator

I have a string '11 15 '. Using a regex, I then compare the values within that string, in this case 11 and 15 (it could be any number of digits, but I'll keep it simple with two 2-digit numbers).

For each of those numbers, I then see if it matches any of the numbers I want; in this case I want to see if the number is '12', '13', or '14'. If it is, then I change the value of '$m':

my $string = '11 15 ';
while ( $string =~ /([0-9]{1,})\s+/ig ) {
    my $m = $1;
    print $m . ".....";
    $m = 'change value' if $m =~ /[12...14]{2,}/g;
    print $m . "\n";
}

Produces:

11.....change value
15.....15

'15' stays the same, as it should. But '11' changes. What am I doing wrong?


[12...14] is a character class: it matches any one of the characters "1", "2", ".", and "4". Combined with {2,}, "11" matches and "15" doesn't. If you're just comparing numbers, you shouldn't be using regular expressions. Change your line to the following:

$m = 'change value' if $m ~~ [12..14];

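Or, if you can't guarantee Perl 5.10 or later (the smartmatch operator ~~ was added in 5.10, then marked experimental in 5.18 and later deprecated):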

$m = 'change value' if grep { $m == $_ } 12..14;


You've misunderstood the regular expression. Where you've written [12...14]{2,}, this means "match 2 or more characters, each of which is 1 or 2 or dot or dot or dot or 1 or 4". The repeats add nothing, so the class is just the set 1, 2, 4, and dot.
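
To see this for yourself, here is a minimal sketch that runs a few sample strings through the same pattern. The helper name matches_class and the sample values are only illustrative, not part of your code:

```perl
use strict;
use warnings;

# The character class [12...14] is just the set { '1', '2', '.', '4' }.
my $class = qr/[12...14]{2,}/;

# Hypothetical helper: returns 1 if $text contains two or more
# consecutive characters from that set, 0 otherwise.
sub matches_class {
    my ($text) = @_;
    return $text =~ $class ? 1 : 0;
}

for my $text ('11', '15', '24', '..', '13', '99') {
    printf "%-4s %s\n", "'$text'", matches_class($text) ? 'matches' : 'no match';
}
```

Note that '13', one of the values you actually want, does not match, while '11', '24', and even '..' do.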

Try something like:

$m = 'change value' if $m =~ /(\d{2,})/ and $1 >= 12 and $1 <= 14;

In a substitution operation, this could be written as:

$m =~ s/(\d{2,})/ $1 >= 12 && $1 <= 14 ? 'change value' : $1/ge;

That is, capture 2 or more digits and then test what you have captured to see if it's a value you want to change, using Perl code in the replacement section of the substitution. The e modifier tells Perl to evaluate the replacement as Perl code, and the g modifier applies it to every match.
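
As a rough sketch of the same idea applied to a whole string at once, assuming you want the bounds and the replacement text as parameters (the sub name replace_in_range is made up for this example, and it uses \d+ rather than \d{2,} so single-digit numbers are tested too):

```perl
use strict;
use warnings;

# Hypothetical helper: replace every number from $low to $high (inclusive)
# found in $text with $replacement, leaving all other numbers untouched.
sub replace_in_range {
    my ($text, $low, $high, $replacement) = @_;
    # /e evaluates the replacement as Perl code; /g applies it to every number.
    $text =~ s/(\d+)/ $1 >= $low && $1 <= $high ? $replacement : $1 /ge;
    return $text;
}

print replace_in_range('11 12 13 15', 12, 14, 'X'), "\n";   # prints "11 X X 15"
```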


Let's rewrite your code a bit:

my $string = '11 15 ';
while ( $string =~ /(\d+)/g ) {

I've changed your while statement's regular expression. You can use \d+ to represent one or more digits, and that's easier to understand than [0-9]{1,}. Since the pattern no longer requires whitespace after each number, you also don't need the trailing space at the end of your string.
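
If you only need the numbers themselves, a global match in list context returns all of them in one go. This is a small sketch, and extract_numbers is just an illustrative name:

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of digits found in $text.
sub extract_numbers {
    my ($text) = @_;
    # In list context, /g returns all captures from all matches.
    my @numbers = $text =~ /(\d+)/g;
    return @numbers;
}

# Works the same with or without the trailing space.
print join(',', extract_numbers('11 15')),  "\n";   # prints "11,15"
print join(',', extract_numbers('11 15 ')), "\n";   # prints "11,15"
```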

Let's look at the rest of the code:

my $string = '11 15';
while ( $string =~ /(\d+)/g ) {
    my $match = $1;
    print "$match.....";
    if ($match >= 12 and $match <= 14) {   # or: if ($match ~~ [12..14]) on Perl >= 5.10
        print "change value\n";
    }
    else {
        print "$match\n";
    }
}

You can't test for a numeric range with a regular expression character class the way you're trying to.

Instead, use the regular range test of

if ($match >= 12 and $match <= 14)

or the newer group test:

if ($match ~~ [12..14])  #Note only two dots and not three!

That last one only works in newer versions of Perl (such as the 5.12 I have on my Mac and the 5.14 I have on my Linux box), but not in the Perl 5.8 I have on my Solaris box.
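
If you need something that behaves the same on every one of those versions, one option is to wrap the plain comparison in a small sub. This is only a sketch: in_range is a made-up name, and it assumes you only care about non-negative integers:

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $value is a non-negative integer
# between $low and $high inclusive, false (0) otherwise.
sub in_range {
    my ($value, $low, $high) = @_;
    # Reject anything that isn't purely digits before comparing numerically.
    return 0 unless defined $value && $value =~ /\A\d+\z/;
    return ($value >= $low && $value <= $high) ? 1 : 0;
}

for my $n (11 .. 15) {
    print "$n: ", in_range($n, 12, 14) ? 'in range' : 'out of range', "\n";
}
```

The digit check also avoids "isn't numeric" warnings if a non-number ever reaches the comparison.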

A few tips:

  • Use indents and spaces. It makes your code more readable.
  • Use descriptive names for variables. Instead of $m, I used $match.
  • Don't overuse appended if statements (statement modifiers). An appended if is harder to spot, so you might miss something important, and it makes your code harder to update. It's fine when the statement itself is clear and simple and the appended if improves readability. That last part is a bit subjective, but you'll commonly see appended if statements in things like return if not -f $file;.
  • Keep variables single purpose. In this case, instead of changing the value of $match, I used an if/else statement. Imagine your code were a bit more complex and someone had to add a new feature. They see the $match variable and think this is what they need. Unfortunately, you changed what $match is. It's now a value to be printed out and not the matched string. It might take the person who changed your program quite a while to figure out what happened to the value of $match and why it has been mysteriously set to 'change value'.
  • In a print statement, you can include variables inside double quotes. Perl can do this because its variables use sigils to mark variable names, which many other languages don't have. It's usually easier to read if you combine variables and other text in a single string.

For example:

 print "The range of possible values is $low to $high\n";

vs.

 print "The range of possible values is " . $low . " to " . $high . "\n";

Notice how in the second example, I had to be careful of spaces inside the quotes while in the first example, the required spaces came rather naturally. Imagine having to change that statement in a later version of the program. Which would be easier to maintain?
