Perl Regex Explanation
Hey, so I am not good with regex right now, though I am trying to learn. Can someone explain this one for me bit by bit?
if ($fileStrings[$stringCount] =~ m/((?:include|require)(?:_once)?\s*\(.*?\$.*?\);)/gi)
Thanks
m/((?:include|require)(?:_once)?\s*(.?\$.?);)/gi
m
match for
/
pattern delimiter
(?:include|require)
match but not capture 'include' or 'require'
(?:_once)?
optionally match, but do not capture, '_once'
\s*
0 or more whitespace characters (spaces, tabs, newlines, etc.)
(.?\$.?)
match and capture 0 or 1 of any character, followed by a literal $ character,
followed by 0 or 1 of any character
;
match a semicolon
(...)
outer parentheses - capture the whole match
/
pattern delimiter
gi
global, case-insensitive search
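Note that the breakdown above follows the pattern with (.?\$.?), while the pattern as posted in the question has escaped parentheses and lazy quantifiers there: \(.*?\$.*?\). Here is a minimal sketch of that posted version, written with the /x modifier so each piece can carry a comment. The sample lines are hypothetical and are not from the question.

```perl
use strict;
use warnings;

# The pattern as posted in the question, spread out with /x.
# The g modifier is left off because each string is tested only once.
my $include_re = qr/
    (                       # capture group 1: the whole statement
        (?:include|require) # keyword, not captured
        (?:_once)?          # optional _once suffix, not captured
        \s*                 # optional whitespace
        \(                  # literal opening parenthesis
        .*?                 # as few characters as possible
        \$                  # literal dollar sign
        .*?                 # as few characters as possible
        \)                  # literal closing parenthesis
        ;                   # semicolon
    )
/xi;

# Hypothetical sample lines, made up for illustration.
my @samples = (
    'include($path);',
    'require_once ("lib/" . $name . ".php");',
    'include("static.php");',
);

for my $line (@samples) {
    if ($line =~ $include_re) {
        print "match:    $1\n";
    }
    else {
        print "no match: $line\n";
    }
}
```

Read this way, the pattern only matches include/require calls whose parenthesized argument contains a $ somewhere, so the third sample, which has no variable in it, is skipped.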
I usually find it easiest to write a test program to check my understanding. Maybe this will help you understand what the regex is doing:
#! /usr/bin/env perl
use warnings;
use strict;
use feature qw(say);
for my $line (
    'include_once F$G;',
    'require_once F$G;',
    'INCLUDE F$G;',
    'include_once AF$G;',
    'include_once F$G;',
) {
    if ($line =~ m/((?:include|require)(?:_once)?\s*(.?\$.?);)/gi) {
        say qq(Line = "$line");
        say qq(\$1 = "$1");
        say qq(\$2 = "$2"\n);
    }
    else {
        say qq(Line = "$line");
        say "No match!\n";
    }
}
And the output is:
Line = "include_once F$G;"
$1 = "include_once F$G;"
$2 = "F$G"

Line = "require_once F$G;"
$1 = "require_once F$G;"
$2 = "F$G"

Line = "INCLUDE F$G;"
$1 = "INCLUDE F$G;"
$2 = "F$G"

Line = "include_once AF$G;"
No match!

Line = "include_once F$G;"
$1 = "include_once F$G;"
$2 = "F$G"
Parentheses capture parts of the match into the variables $1, $2, $3, etc. A ?: at the start of a group stops that group from capturing (thus, the value ends up in $2 instead of $4). The outer parentheses have no ?:, so they capture the entire match in $1.
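To make the numbering concrete, here is a small sketch that runs the same test string through two versions of the pattern: one where every group captures, and one with ?: on the keyword groups as in the program above. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $line = 'require_once F$G;';

# Every group captures, so the text around the dollar sign lands in group 4.
my @all_capturing = $line =~ /((include|require)(_once)?\s*(.?\$.?);)/i;

# With ?: on the keyword groups, the same text lands in group 2.
my @some_capturing = $line =~ /((?:include|require)(?:_once)?\s*(.?\$.?);)/i;

# In list context (without the g modifier) a match returns the captured groups in order.
print scalar(@all_capturing),  " groups: @all_capturing\n";
print scalar(@some_capturing), " groups: @some_capturing\n";
```

The first pattern yields four captures and the second yields two, which is why the value shows up as $2 in the test program.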
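The g modifier at the end makes the match global. In list context it returns every match in the string; in scalar context, such as the if in the program, each evaluation finds only the next match, continuing from where the previous one ended. That is probably why it made no difference in my tests: each string contains one match, and the if runs only once per string.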
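For what it's worth, here is a rough sketch of how the g modifier behaves when one string holds more than one statement. The input string is made up for this example. In scalar context a while loop picks up the matches one at a time, and in list context all captures come back at once.

```perl
use strict;
use warnings;

# Hypothetical input with two statements in one string.
my $text = "include_once F\$G;\nrequire H\$I;\n";

my $re = qr/((?:include|require)(?:_once)?\s*(.?\$.?);)/i;

# Scalar context: each pass through the loop resumes after the previous match.
my @statements;
while ($text =~ /$re/g) {
    push @statements, $1;
}

# List context: with two capture groups, group 1 and group 2 are returned for every match.
my @pairs = $text =~ /$re/g;

print "loop found: $_\n" for @statements;
print "list context returned ", scalar(@pairs), " values\n";
```

With a plain if, only the first of the two statements would be seen, which matches what the test program shows.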