Please explain this regex
I have a regex which reads:
@"<img\s*[^>]*>(?:\s*?</img>)?
Can someone please explain this part: (?:\s*?</img>)?
What is that?
Match, but don't capture, any amount of whitespace followed by a closing image tag, zero or one times:
(?: = start a group that matches but doesn't capture
\s*? = any amount of whitespace (non-greedy)
</img> = closing image tag
)? = end of the group, which may occur zero or one times
:)
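The piece-by-piece reading can be written directly into the pattern as comments. The `@"` prefix suggests the original comes from C#; this sketch uses Python's `re` module instead, on the assumption that both engines treat these constructs the same way. The names `IMG_TAG` and `first_match` are made up for illustration.

```python
import re

# Assumed Python port of the pattern from the question, written in verbose
# mode so that each piece can carry its own comment.
IMG_TAG = re.compile(
    r"""
    <img\s*[^>]*>   # opening tag with any attributes
    (?:             # start of a non-capturing group
        \s*?        # any amount of whitespace, as little as possible
        </img>      # a literal closing tag
    )?              # the whole group may occur zero or one times
    """,
    re.VERBOSE,
)


def first_match(text):
    """Return the text of the first match, or None if nothing matches."""
    m = IMG_TAG.search(text)
    return m.group(0) if m else None


print(first_match('<img src="a.png">   </img>'))  # closing tag present
print(first_match('<img src="a.png">'))           # closing tag absent
```

Verbose mode ignores whitespace and `#` comments inside the pattern, so the compiled regex should be equivalent to the one-line version.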
(?:\s*?) matches any whitespace, if it exists, after the image tag. The ?: at the beginning tells the regex engine not to capture that group (meaning it won't appear among the captured groups of the match).
A non-capturing group of any number of whitespace characters, followed by a closing img tag.
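To see what "non-capturing" changes in practice, the same pattern can be compared with and without the `?:`. This is a rough sketch in Python's `re` module, assumed to behave like the original engine here; the variable names are made up for illustration.

```python
import re

text = "<img> </img>"

# Same pattern twice; the only difference is "?:" at the start of the group.
non_capturing = re.compile(r"<img\s*[^>]*>(?:\s*?</img>)?")
capturing = re.compile(r"<img\s*[^>]*>(\s*?</img>)?")

m1 = non_capturing.search(text)
m2 = capturing.search(text)

# group(0) is the whole match and is the same for both patterns.
print(repr(m1.group(0)), repr(m2.group(0)))

# groups() lists only the captured groups: empty for the first pattern,
# one entry for the second.
print(m1.groups(), m2.groups())
```

The group still takes part in matching either way; `?:` only controls whether its text is stored separately.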
The entire expression matches any <img> tag, together with a closing </img> tag if one follows it with nothing but whitespace in between. The closing tag is part of the overall match; the (?:) syntax only means the group is not stored as a separate numbered capture. Because the group ends in ?, an <img> tag with no closing tag still matches.
Some restrictions that are part of this regex:
- The \s* in the opening tag is redundant because [^>]* will match whitespace too
- Only whitespace is allowed between the opening and closing tags if the closing tag is to be included in the match
Some examples:
<img> will match
<img></img> will match in full
<img attr="123"></img> will match in full
<imgabc></img> will match in full, because nothing requires whitespace after <img
<img> </img> will match in full
<img>ab</img> will match, but only <img>
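A quick way to check how the pattern treats each of these samples is to run it and print exactly what was matched. This is a minimal sketch in Python's `re` module, assumed to handle these constructs the same way as the original engine; `PATTERN`, `SAMPLES` and `matched_text` are illustrative names.

```python
import re

# Assumed Python port of the pattern from the question.
PATTERN = re.compile(r"<img\s*[^>]*>(?:\s*?</img>)?")

SAMPLES = [
    "<img>",
    "<img></img>",
    '<img attr="123"></img>',
    "<imgabc></img>",
    "<img> </img>",
    "<img>ab</img>",
]


def matched_text(sample):
    """Return the exact text the pattern matched, or None for no match."""
    m = PATTERN.search(sample)
    return m.group(0) if m else None


for s in SAMPLES:
    print(f"{s!r:30} -> {matched_text(s)!r}")
```

Printing the matched text rather than a plain yes/no makes it easy to see whether the closing tag was included in the match.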
I highly recommend the Regular Expression Designer, available for free at www.radsoftware.com.au, for testing regexes.