What do these characters mean?
Could you please explain the statement below? I think it's called regex, but I'm really not sure.
~<p>(.*?)</p>~si
What do si and (.*?) stand for?
Find everything between <p> and </p>, case-insensitive (i), so <P> will also work, and possibly spanning multiple lines (s).
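To make that concrete, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for whatever engine the original snippet targets: the ~ delimiters are dropped, and the s and i flags are written as re.DOTALL and re.IGNORECASE. The sample string is made up for illustration.

```python
import re

# Python rendering of ~<p>(.*?)</p>~si (assumption: the ~ characters are
# just delimiters, and s / i map to re.DOTALL / re.IGNORECASE).
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL | re.IGNORECASE)

# Hypothetical sample input, not taken from the question.
html = "<p>first</p>\n<P>second\nline</P>"

# With one capturing group, findall returns only the captured text.
paragraphs = PARAGRAPH.findall(html)
print(paragraphs)  # ['first', 'second\nline']
```

The second paragraph is found even though its tags are uppercase and its content contains a newline.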
Actually, it's called regex, short for Regular Expression, and has a syntax that doesn't look familiar at first, but becomes second-nature quickly enough.
si are flags: s stands for "dotall", which makes the . (which I'll explain in a bit) match every single character, including newlines. The i stands for "case-insensitive", which is self-explanatory.
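The effect of each flag is easiest to see by switching them on one at a time. The following is only a sketch using Python's re module, where re.IGNORECASE and re.DOTALL play the roles of i and s; the input string is a made-up example.

```python
import re

pattern = r"<p>(.*?)</p>"
# Hypothetical input: uppercase tags and a newline inside the paragraph.
text = "<P>spans\ntwo lines</P>"

# No flags: <p> does not match <P>, so there is no match at all.
no_flags = re.search(pattern, text)

# Only case-insensitive: the tags match, but . stops at the newline.
only_i = re.search(pattern, text, re.IGNORECASE)

# Both flags: the tags match and . is allowed to cross the newline.
both = re.search(pattern, text, re.IGNORECASE | re.DOTALL)

print(no_flags, only_i, both.group(1))
```

Only the last call finds a match, and its captured group is the full two-line content.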
The (.*?) part says this: "match 0 or more repetitions (*) of any character (.), and make it lazy (?), i.e. match as few characters as possible". The parentheses capture whatever was matched so that it can be retrieved afterwards.
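The difference the ? makes shows up as soon as the input contains more than one paragraph. A small sketch, again assuming Python's re module and a made-up input string:

```python
import re

# Hypothetical input with two paragraphs on one line.
text = "<p>one</p><p>two</p>"

# Lazy: each match stops at the first </p> it can reach.
lazy = re.findall(r"<p>(.*?)</p>", text)

# Greedy (no ?): the match runs on to the last </p> in the string.
greedy = re.findall(r"<p>(.*)</p>", text)

print(lazy)    # ['one', 'two']
print(greedy)  # ['one</p><p>two']
```

The lazy version yields the two paragraphs separately, while the greedy version swallows everything between the first <p> and the last </p>.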
The "matching" happens when you check a string against the regex. For example, you say that <p>something</p> matches the given regex.
You'll find @Mchl's link a great source of information on regex.
Hope this helps.
It's called regex - short for regular expressions, which is a standard for string parsing, manipulation, and validation. Look at the reference section on the site I linked to and you'll be able to work out what that regex does.
It's a lazy regular expression: the (.*?) part matches as LITTLE as possible (lazy), whereas by default a quantifier matches as much as it can (greedy).
Check out this resource for a better, more complete explanation:
http://www.regular-expressions.info/repeat.html#greedy