How do I substitute overlapping matches with a Perl regex?
I want to find a开发者_StackOverflow社区ll occurences of "BBB"
in a string and substitute them with "D"
. For example, I have "ABBBBC"
and want to produce "ADBC"
and "ABDC"
. (First substitute the first BBB
, and then substitute the other BBB
). Is there a nice way to do this in Perl?
$str = "ABBBBC";
for ( $str =~ m/B(?=BB)/g ) {
# I match both the BBBs here, but how to substitute the relevant part?
}
I want to get this array: ('ADBC', 'ABDC')
, which comes from changing either of the BBB
s to a D
. The string "ABBBBBC"
would give me "ADBBC"
, "ABDBC"
and "ABBDC"
.
To get overlapping matches, you have to play around with Perl's pos
operator.
pos SCALAR
pos
Returns the offset of where the lastm//g
search left off for the variable in question ($_
is used when the variable is not specified). Note that 0 is a valid match offset.undef
indicates that the search position is reset (usually due to match failure, but can also be because no match has yet been run on the scalar).
pos
directly accesses the location used by the regexp engine to store the offset, so assigning to pos will change that offset, and so will also influence the\G
zero-width assertion in regular expressions. Both of these effects take place for the next match, so you can't affect the position withpos
during the current match, such as in(?{pos() = 5})
ors//pos() = 5/e
.Setting
pos
also resets the matched with zero-length flag, described under Repeated Patterns Matching a Zero-length Substring in perlre.Because a failed
m//gc
match doesn't reset the offset, the return frompos
won't change either in this case. See perlre and perlop.
For example:
#! /usr/bin/env perl
use strict;
use warnings;
my $str = "ABBBBC";
my @replaced;
while ($str =~ m/^(.*)\G(.+?)BBB(.*)$/g ) {
push @replaced, $1 . $2 . "D" . $3;
pos($str) = length($1) + 1;
}
print "[", join("][" => @replaced), "]\n";
Output:
$ ./prog [ADBC][ABDC]
local our @replaced;
'ABBBBC' =~ /^(.*)BBB(.*)\z(?{ push @replaced, $1.'D'.$2 })(?!)/s;
精彩评论