
Some logic inside a regular expression in Perl

In a previous question, I asked about matching multiple patterns. Now my question is:

I have a few matching patterns:

$text =~ m#finance(.*?)end#s; (1)

$text =~ m#<class>(.*?)</class>#s; (2)

$text =~ m#/data(.*?)<end>#s; (3)

$text =~ m#/begin(.*?)</begin>#s; (4)

I want to match (1), (2) and (3) first. However, after matching (1) or (2), if (4) appears before another (1) or (2), then do not match (3) but only (4). So essentially (4)'s appearance excludes (3) from being matched. But if no (4) appears, (3) is matched. Is there a good way to do this?

Many thanks.


There's one unclear point in your specification: does a match of (4) suppress (3) only until the next match of (1) or (2), or is the suppression wider in scope?

In any case, that one's best resolved with a state machine.

# Bits 1 and 2: (1) or (2) has been seen. Bit 4: (4) followed them, so (3) is suppressed.
my $state = 0;
while ($text =~ m#(?: finance (.*?) end
                  |   <class> (.*?) </class>
                  |   /data   (.*?) <end>
                  |   /begin  (.*?) </begin>
                  )
                 #sgx) {
  if (defined $1) {
    $state = ($state & ~4) | 1; # clear 4, set 1
    print $1;
  }
  elsif (defined $2) {
    $state = ($state & ~4) | 2; # clear 4, set 2
    print $2;
  }
  elsif (defined $3) {
    # A suppressed (3) is consumed silently; it must not fall through to the die below.
    print $3 unless $state & 4;
  }
  elsif (defined $4) {
    print $4;
    if ($state & 3) { # 1 OR 2
      $state = 4; # set 4, clear 1 and 2
    }
  }
  else {
    die 'Someone modified me without extending the state machine!';
  }
}

(This is syntax checked, but not tested; it's complex enough that a sample data set would be useful.)
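
For anyone who wants to try this out, here is a minimal sketch of the same state machine run against a made-up sample string. It makes a few assumptions that are not in the original post. The helper name extract_matches and the sample text are hypothetical. The delimiters are the ones from the question (/data ... <end> and /begin ... </begin>). Captures are collected and whitespace-trimmed instead of printed. A (3) that matches while suppressed is skipped quietly. Suppression is read as lasting only until the next (1) or (2), which is one possible interpretation of the question.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the state machine over $text and returns the
# captures that survive the "(4) suppresses (3)" rule, in order.
sub extract_matches {
    my ($text) = @_;
    my ($state, @found) = (0);
    while ($text =~ m{(?: finance (.*?) end
                      |   <class> (.*?) </class>
                      |   /data   (.*?) <end>
                      |   /begin  (.*?) </begin>
                      )}sgx) {
        my $capture;
        if    (defined $1) { $state = ($state & ~4) | 1; $capture = $1 }
        elsif (defined $2) { $state = ($state & ~4) | 2; $capture = $2 }
        elsif (defined $3) { $capture = $3 unless $state & 4 }  # skipped while suppressed
        elsif (defined $4) { $capture = $4; $state = 4 if $state & 3 }
        next unless defined $capture;
        $capture =~ s/^\s+|\s+$//g;   # trim surrounding whitespace for readability
        push @found, $capture;
    }
    return @found;
}

# Made-up sample: E follows a (4) that came after (1)/(2), so it is dropped;
# G follows a fresh (1), so it is kept.
my $sample = 'finance A end /data B <end> <class> C </class> '
           . '/begin D </begin> /data E <end> finance F end /data G <end>';
print join(',', extract_matches($sample)), "\n";   # prints A,B,C,D,F,G
```

If a (4) turns up before any (1) or (2) has been seen, this version leaves (3) unsuppressed, which follows from the check on bits 1 and 2 in the (4) branch.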
