What Perl regex can match the contiguous subset of digits 12345?
I would like a Perl regex that matches any contiguous subset of the string '12345'.
I'm probably just having a brain-freeze, but this is my test code and current best regex. I can see how to brute-force the situation by adding alternatives, but 开发者_StackOverflow社区I'm wondering which elegant alternative I'm missing. [I don't specifically need captures for the digits; I have left the sample regex without non-capturing parentheses to make it slightly less cluttered.]
Test case:
use strict;
use warnings;
my @good = qw( 1 12 123 1234 12345 2 23 234 2345 3 34 345 4 45 5);
my @bad = qw( 0 6 13 134 1345 145 15 124 1245 125 1235 24 245 25
35 21 32 43 54 543 5432 54321);
my $qr = qr/^(1?(2?(3(4(5)?)?)?)?)$/; # 3 'good', 3 'bad' failures
#my $qr = qr/^(1?(2(3(4(5)?)?)?)?)$/; # 6 'good' failures.
my $fail = 0;
foreach my $opt (@good)
{
printf "GOOD %d: $opt - missed by regex\n", ++$fail if ($opt !~ /$qr/);
}
foreach my $opt (@bad)
{
printf "BAD %d: $opt - allowed by regex\n", ++$fail if ($opt =~ /$qr/);
}
print(($fail == 0) ? "PASS\n" : "FAIL\n");
Sample outputs:
Case 1 (commented out):
GOOD 1: 3 - missed by regex
GOOD 2: 34 - missed by regex
GOOD 3: 345 - missed by regex
GOOD 4: 4 - missed by regex
GOOD 5: 45 - missed by regex
GOOD 6: 5 - missed by regex
FAIL
Case 2 (active):
GOOD 1: 4 - missed by regex
GOOD 2: 45 - missed by regex
GOOD 3: 5 - missed by regex
BAD 4: 13 - allowed by regex
BAD 5: 134 - allowed by regex
BAD 6: 1345 - allowed by regex
FAIL
So, can you write a nice simple, symmetric regex that matches what I want and not what I don't?
This regex lets the test case pass, but isn't as elegant as I was hoping for:
my $qr = qr/^((1?(2(3(4(5)?)?)?)?)|(3?(4(5)?)?)|5)$/;
Test case with Justin's solution
use strict;
use warnings;
my @good = qw( 1 12 123 1234 12345 2 23 234 2345 3 34 345 4 45 5);
my @bad = qw( 0 6 13 134 1345 145 15 124 1245 125 1235 24 245 25
35 21 32 43 54 543 5432 54321 11 122 1233 1223 12234);
#my $qr = qr/^(1?(2?(3(4(5)?)?)?)?)$/; # 3 'good', 3 'bad' failures
#my $qr = qr/^(1?(2(3(4(5)?)?)?)?)$/; # 6 'good' failures.
#my $qr = qr/^((1?(2(3(4(5)?)?)?)?)|(3?(4(5)?)?)|5)$/; # Passes
# Ysth's solution - passes
#my $qr = qr/^[12345](?:(?<=1)2|(?<=2)3|(?<=3)4|(?<=4)5)*$/;
my $fail = 0;
foreach my $opt (@good)
{
printf "GOOD %d: $opt - missed by regex\n", ++$fail if ('12345' !~ /$opt/);
#printf "GOOD %d: $opt - missed by regex\n", ++$fail if ($opt !~ /$qr/);
}
foreach my $opt (@bad)
{
printf "BAD %d: $opt - allowed by regex\n", ++$fail if ('12345' =~ /$opt/);
#printf "BAD %d: $opt - allowed by regex\n", ++$fail if ($opt =~ /$qr/);
}
print(($fail == 0) ? "PASS\n" : "FAIL\n");
Reverse the match:
'12345' =~ /$opt/
Here's a revised version of Justin's idea:
index('12345', $opt) >= 0;
Or, if you need to exclude the empty string
index('12345', $opt) >= 0 and length $opt;
This way, you don't need to check $opt
for regex metachars. I'm not sure which version woud be faster.
/^[12345](?:(?<=1)2|(?<=2)3|(?<=3)4|(?<=4)5)*\z/
Sorry, got it wrong twice. This should do it. The explicit list of all possible matches is going to be faster, though.
Do you want something better than
/\b(1|12|123|1234|12345|2|23|234|2345|3|34|345|4|45|5)\b/
?
精彩评论