What's wrong with this Perl matching pattern?
For an SVN pre-commit hook, I want to strip leading spaces (there are surely no tabs present) from lines that contain certain keywords before they are committed to the server. For example, the line
MACRO_1 (ABC, "Some String");
should be matched and the leading spaces before MACRO_1 should be removed. Currently, I have the following expression to match:
if($line =~ /^\s+MACRO_1|MACRO_2|MACRO_3|MACRO_4.*/) {
print "Stripping leading space on line $line\n";
$line =~ s/^\s*//gsxm; # strip leading spaces
}
When I look at the console, I get the following output:
Stripping leading space on line MACRO_1;
Stripping leading space on line MACRO_2;
Stripping leading space on line MACRO_3 (ABC, "Some String");
Stripping leading space on line MACRO_1;
Stripping leading space on line MACRO_2;
Stripping leading space on line MACRO_1(123);
Stripping leading space on line MACRO_2(123);
Stripping leading space on line MACRO_1;
Stripping leading space on line MACRO_2;
Stripping leading space on line MACRO_1;
This confuses me, since lines with no leading space are also reported. I thought they would not be matched by \s+, which, as I understand it, means one or more whitespace characters.
What's the problem?
The | operator has the lowest precedence among the regex operators. So the following is an incorrect regex to match a string containing either only foo or only bar:
^foo|bar$
because it'll be treated as:
(^foo)|(bar$)
which matches any string beginning with foo or ending in bar; so it matches, for example, foo1 and 1bar.
The correct regex would be ^(?:foo|bar)$
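The difference between the two patterns can be checked with a small script. This is only a sketch; the variable names $ungrouped and $grouped and the list of test strings are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative names, not from the question.
my $ungrouped = qr/^foo|bar$/;       # parsed as (^foo)|(bar$)
my $grouped   = qr/^(?:foo|bar)$/;   # both anchors apply to each alternative

# Print whether each test string matches each pattern.
for my $s ('foo', 'bar', 'foo1', '1bar', 'baz') {
    printf "%-5s ungrouped=%-8s grouped=%s\n",
        $s,
        ($s =~ $ungrouped ? 'match' : 'no match'),
        ($s =~ $grouped   ? 'match' : 'no match');
}
```

With the ungrouped pattern, foo1 and 1bar should be reported as matches; with the grouped one, only foo and bar should.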
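Similarly, your pattern is parsed as (^\s+MACRO_1)|(MACRO_2)|(MACRO_3)|(MACRO_4.*), so only the first alternative requires leading whitespace; the other three match anywhere in the line. The correct regex in your case is: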
if($line =~ /^\s+(?:MACRO_1|MACRO_2|MACRO_3|MACRO_4).*/) {
Also note that there is no need for the trailing .* in the pattern.
You can also shorten your regex to:
if($line =~ /^\s+MACRO_[1-4]/) {
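To see the shortened pattern in the context of the hook, here is a minimal sketch. The helper name strip_macro_indent and the sample lines are assumptions for illustration; the real hook would read lines from the committed files instead.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading whitespace only when the line starts
# with whitespace followed directly by MACRO_1 .. MACRO_4.
sub strip_macro_indent {
    my ($line) = @_;
    if ($line =~ /^\s+MACRO_[1-4]/) {
        $line =~ s/^\s+//;   # remove the leading whitespace
    }
    return $line;
}

# Made-up sample lines: only the first one should be changed.
my @samples = (
    '    MACRO_1 (ABC, "Some String");',
    'MACRO_2(123);',
    '    int x = MACRO_3;',
);
print strip_macro_indent($_), "\n" for @samples;
```

Lines that already start at column one, or that only mention a macro later in the line, should come back unchanged.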