Some logic inside a regular expression in Perl
In a previous question, I asked about multiple matching patterns. Now my question is:
I have a few matching patterns:
$text =~ m#finance(.*?)end#s;         (1)
$text =~ m#<class>(.*?)</class>#s;    (2)
$text =~ m#/data(.*?)<end>#s;         (3)
$text =~ m#/begin(.*?)</begin>#s;     (4)
I want to match (1), (2) and (3) first. However, after matching (1) or (2), if (4) appears before another (1) or (2), then do not match (3), but only (4). So essentially, the appearance of (4) excludes (3) from being matched. If no (4) appears, (3) is matched. Is there a good way to do this?
Many thanks.
There's one unclear point in your specification: does the suppression of (3) last only from a match of (4) until the next match of (1)/(2), or is it wider in scope?
In any case, that one's best resolved with a state machine.
my $state = 0;  # bit 1: (1) seen, bit 2: (2) seen, bit 4: (4) seen since the last (1)/(2)
while ($text =~ m#(?: finance (.*?) end
                    | <class> (.*?) </class>
                    | /data   (.*?) <end>
                    | /begin  (.*?) </begin>
                  )
                 #sgx) {
    if (defined $1) {
        $state = ($state & ~4) | 1;   # (1) lifts the suppression of (3)
        print $1;
    }
    elsif (defined $2) {
        $state = ($state & ~4) | 2;   # (2) lifts the suppression of (3)
        print $2;
    }
    elsif (defined $3) {
        # A suppressed (3) is skipped here rather than falling through to die
        print $3 unless $state & 4;
    }
    elsif (defined $4) {
        print $4;
        if ($state & 3) { # 1 OR 2
            $state = 4;   # set 4, clear 1 and 2
        }
    }
    else {
        die 'Someone modified me without extending the state machine!';
    }
}
(This is syntax checked, but not tested; it's complex enough that a sample data set would be useful.)
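For lack of real data, here is a small made-up data set that exercises the rules as I read them. This is only a sketch under assumptions: the sub name extract_matches and the sample strings are invented for illustration, the delimiters are the ones from the question, and each capture is tagged with its pattern number so the result is easy to read. It takes the narrow reading of the rule: (4) suppresses (3) only if a (1) or (2) was seen before it, and only until the next (1) or (2).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captures the state machine would print,
# each prefixed with the number of the pattern that produced it.
sub extract_matches {
    my ($text) = @_;
    my @out;
    my $state = 0;    # bits 1/2: (1)/(2) seen; bit 4: (4) seen since then
    while ($text =~ m{(?: finance (.*?) end
                        | <class> (.*?) </class>
                        | /data   (.*?) <end>
                        | /begin  (.*?) </begin>
                      )}sgx) {
        if (defined $1) {
            $state = ($state & ~4) | 1;
            push @out, "1:$1";
        }
        elsif (defined $2) {
            $state = ($state & ~4) | 2;
            push @out, "2:$2";
        }
        elsif (defined $3) {
            push @out, "3:$3" unless $state & 4;    # skipped while suppressed
        }
        elsif (defined $4) {
            push @out, "4:$4";
            $state = 4 if $state & 3;    # suppress (3) only after a (1)/(2)
        }
    }
    return @out;
}

# No (4) at all: (3) is matched.
print join(',', extract_matches('financeAend/dataB<end>')), "\n";
# prints 1:A,3:B

# (4) follows (1): the later (3) is skipped.
print join(',', extract_matches('financeAend/beginC</begin>/dataB<end>')), "\n";
# prints 1:A,4:C

# A new (2) after (4) lifts the suppression again.
print join(',', extract_matches('financeAend/beginC</begin><class>D</class>/dataB<end>')), "\n";
# prints 1:A,4:C,2:D,3:B
```

Note that a (3) which occurs before the (4) has already been reported by the time (4) is seen. If the wider reading is what you want, the matches between two (1)/(2) hits would have to be buffered and filtered afterwards.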