What does the following code do?

What does the following code do? Can anybody explain it to me?

$data = "What is the STATUS of your mind right now?";

$data =~/.*/; print "$1,$2\n";

$data =~/(.*?)(u+).*/; print "$1, $2\n";

$data =~/(.?)(u+).*/; print "$1, $2\n";

$data =~/(\w+\s)+/; print "$1, $2\n";

What are $1 and $2? How do they get their values? And what do all these regular expressions mean?

Thanks in advance :)


Please read perldoc perlretut, which will answer all your questions.

The general reference for Perl regular expressions is perldoc perlre, but you should read the tutorial first as it serves as a nicer introduction.


$1 and $2 are capture variables. They refer to whatever was matched by the parenthesized groups in the last successful regular expression match.

$1 has the part of the string that was matched in the first parenthesis group. $2 has the part of the string that was matched in the second parenthesis group. You can guess what $3 would contain.
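As a minimal sketch of how the groups map onto $1 and $2, here is a small example on a made-up string (the string "key=value" and the variable names are assumptions for illustration, not part of the question). It checks that the match succeeded before reading the capture variables, because after a failed match they still hold the values from the previous successful one.

```perl
use strict;
use warnings;

my $text = "key=value";   # hypothetical sample string

# Two parenthesized groups: group 1 captures the key, group 2 the value.
my ($key, $value);
if ($text =~ /(\w+)=(\w+)/) {
    ($key, $value) = ($1, $2);
    print "$key, $value\n";   # prints "key, value"
}
```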

Let's look at your example:

$data = "What is the STATUS of your mind right now?";
$data =~/.*/; print "$1,$2\n";

There are no parentheses here, so nothing is captured: $1 and $2 are undefined, and the print shows only the comma.
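To see this for yourself, a short sketch like the following can help. It assumes a fresh script with no earlier match, and it uses $&, Perl's variable for the whole matched text, to show that the match itself succeeded even though nothing was captured. The helper variable names are made up for the example.

```perl
use strict;
use warnings;

my $data = "What is the STATUS of your mind right now?";

$data =~ /.*/;   # matches the whole string but has no capture groups

my $first_defined = defined $1 ? 1 : 0;   # 0: group 1 does not exist
my $whole_match   = $&;                   # text matched by the entire pattern

print "group 1 is ", ($first_defined ? "defined" : "undefined"), "\n";
print "whole match: $whole_match\n";
```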

$data =~/(.*?)(u+).*/; print "$1, $2\n";

There are two parenthesized groups here. The first one is (.*?), which matches zero or more of any character, but as few as possible (that is what the ? after the * means: a non-greedy match). The second one is (u+), which matches one or more "u" characters.

Matching is case-sensitive, so the uppercase "U" in "STATUS" does not count. The first (and only) lowercase "u" in $data is in the middle of "your", so $1 matches everything up to that "u" ("What is the STATUS of yo"), and $2 matches that one "u".
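The following sketch runs that pattern and, for comparison, the same pattern with the /i (case-insensitive) modifier. The /i variant is not in the question; it is added here only to show that the uppercase "U" in "STATUS" is skipped by the original pattern because matching is case-sensitive. A match in list context returns the captured groups, which is used here instead of $1 and $2.

```perl
use strict;
use warnings;

my $data = "What is the STATUS of your mind right now?";

# Original pattern: stops at the first lowercase "u", in "your".
my ($lazy_1, $lazy_2) = $data =~ /(.*?)(u+).*/;
print "$lazy_1, $lazy_2\n";   # What is the STATUS of yo, u

# Same pattern with /i (an addition for illustration): now the "U"
# in "STATUS" is the first match for (u+).
my ($ci_1, $ci_2) = $data =~ /(.*?)(u+).*/i;
print "$ci_1, $ci_2\n";       # What is the STAT, U
```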

$data =~/(.?)(u+).*/; print "$1, $2\n";

Now the first group is (.?), which matches one single character, or nothing. Then (u+) again matches one or more "u" characters.

Since there's just one lowercase "u" in our string, the first group will be the single character before it, which is "o", and the second group will match the actual "u".

$data =~/(\w+\s)+/; print "$1, $2\n";

Finally, the first group is (\w+\s), which matches one or more "word" characters followed by a whitespace character. "Word" characters are alphanumeric characters and the underscore. There is no second group, but there is a + (one or more) after the group, so the group as a whole can repeat.

So what does it match? The regex engine starts at the beginning of the string, where the group matches "What ". Because of the + after the group, the engine then repeats the group as many times as it can: "is ", "the ", "STATUS ", "of ", "your ", "mind ", "right ". A capture group inside a repetition keeps only what it matched on its last iteration, so $1 ends up as "right ", even though the overall match runs from "What " through "right ".

The repetition stops there because the next attempt fails: \w+ can match "now", but the character after it is "?", which is not whitespace, so \s has nothing to match.

So $1 is "right ", and $2 is undefined because there is no second group.
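A quick way to check what a repeated group captures is to print the whole match next to $1. The sketch below does that, and also shows one possible way to collect every word instead of only the last one: dropping the outer + and using the /g modifier in list context. That /g variant is an addition for illustration, not something from the question, and the variable names are made up.

```perl
use strict;
use warnings;

my $data = "What is the STATUS of your mind right now?";

my ($whole, $last);
if ($data =~ /(\w+\s)+/) {
    $whole = $&;   # everything the pattern matched
    $last  = $1;   # only the final iteration of the group
    print "whole match: '$whole'\n";   # 'What is the STATUS of your mind right '
    print "group 1:     '$last'\n";    # 'right '
}

# With /g in list context, each match of the single group is returned.
my @words = $data =~ /(\w+\s)/g;
print scalar(@words), " words captured\n";   # 8 words captured
```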

Summary:

When you see $1 and $2, you should look for the matching group parentheses in the last regular expression.
