Trouble understanding ?:, ?=, ?!, and backreferences
I'm learning about regular expressions for a script I will be writing down the road, but I've hit a stopping point. I basically understand what ?= and ?! do: they're "lookaheads". To borrow an example: /Win (?=98)/ only matches "Win " if it is followed by "98", whereas /Win (?!XP)/ would match "Win " only if it was not followed by "XP" . . . right?
Now I really don't get the ?: delimiter. And I haven't found a decent example of it and I'm just really, really confused about it. :/ I understand it's supposed to match the entire contained pattern or something?
One more thing I'm confused about is backreferences. Here's one example I found: the regular expression /<(\S+).*>(.*)<\/\1>/ is supposed to match "any tag". I'm just confused as to what the number "1" refers the browser to . . . is it the first match - in which case I would think it would refer to the < character - or something else?
I'm just now dabbling into the world of regular expressions and would love some clarification on these concepts, thank you all in advance!
Your thoughts on the lookahead assertions are correct.
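As a quick check of the two lookahead patterns from the question, here is a minimal sketch. It assumes Python's re module, which the thread does not mention, so the patterns are written without the surrounding slashes. The variable names are only illustrative.

```python
import re

# Positive lookahead: "Win " matches only when "98" comes right after it.
positive = re.compile(r"Win (?=98)")
# Negative lookahead: "Win " matches only when "XP" does not come right after it.
negative = re.compile(r"Win (?!XP)")

# The lookahead checks the following text but does not consume it,
# so the match itself is just "Win ".
print(repr(positive.search("Win 98").group(0)))  # 'Win '
print(positive.search("Win XP"))                 # None
print(negative.search("Win XP"))                 # None
print(repr(negative.search("Win 98").group(0)))  # 'Win '
```

Note that "98" and "XP" never become part of the matched text. That is what separates a lookahead from simply writing /Win 98/.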
\1 refers to the first submatch in parentheses, i.e. whatever was matched by the (\S+) in your example. \2 refers to the second (in the example, (.*)) and so on.
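To see which text each numbered group holds, here is a small sketch of the tag pattern from the question. It assumes Python's re module (not something the thread specifies), and the sample strings are made up for illustration. In Python the / needs no escaping, so <\/\1> is written as </\1>.

```python
import re

# Group 1 is (\S+), the tag name; group 2 is (.*), the tag's content.
# \1 inside the pattern must match the same text that group 1 matched.
tag = re.compile(r"<(\S+).*>(.*)</\1>")

m = tag.search("<b>bold text</b>")
print(m.group(1))  # b
print(m.group(2))  # bold text

# Attributes are absorbed by the .* between the tag name and the ">".
m = tag.search('<a href="x">link</a>')
print(m.group(1))  # a
print(m.group(2))  # link

# The closing tag does not repeat group 1, so there is no match.
print(tag.search("<b>bold text</i>"))  # None
```

So the "1" counts sets of capturing parentheses from the left. It does not count literal characters such as <.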
?:, on the other hand, means that the set of parentheses should not be tied to a reference like \1. You use it if you need parentheses for something (grouping or alternation, say) but don't really care about getting the matched text later on. So, in the regular expression /(?:abc)def(ghi)/, \1 would not expand to abc (because we switched that off using the ?:), but to ghi.
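The numbering in the /(?:abc)def(ghi)/ example can be checked with a short sketch. Again this assumes Python's re module, and the test strings are invented for the demonstration.

```python
import re

m = re.search(r"(?:abc)def(ghi)", "abcdefghi")
print(m.group(0))  # abcdefghi  (the whole match still includes "abc")
print(m.groups())  # ('ghi',)   (only one captured group exists)
print(m.group(1))  # ghi

# Used as a backreference, \1 therefore repeats "ghi", not "abc".
with_backref = re.compile(r"(?:abc)def(ghi)\1")
print(with_backref.search("abcdefghighi") is not None)  # True
print(with_backref.search("abcdefghiabc"))              # None
```

The non-capturing group still has to match "abc"; it just does not take a number.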
Now I really don't get the ?: delimiter. And I haven't found a decent example of it and I'm just really, really confused about it. :/ I understand it's supposed to match the entire contained pattern or something?
The ?: construct allows you to group parts of a regex (just like bare parentheses do) without capturing them. A group written this way is called a "non-capturing group".
So for example:
/^(.*?)(foo|bar)(.*?)$/
Result: First capturing group contains the text before the match, second contains either "foo" or "bar", third contains the rest.
/^(.*?)(?:foo|bar)(.*?)$/
Result: First capturing group contains the text before the match, second contains the text after.
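The two results above can be compared side by side in a minimal sketch, assuming Python's re module and a made-up input string:

```python
import re

text = "one foo two"

# With a capturing alternation, "foo" takes up group 2.
capturing = re.match(r"^(.*?)(foo|bar)(.*?)$", text)
print(capturing.groups())      # ('one ', 'foo', ' two')

# With (?:...), the alternation still matches but gets no group number,
# so the text after it moves up to group 2.
non_capturing = re.match(r"^(.*?)(?:foo|bar)(.*?)$", text)
print(non_capturing.groups())  # ('one ', ' two')
```

Both patterns match exactly the same strings. The only difference is how the groups are numbered afterwards.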