How can I make a regex greedy for the leading text (as opposed to the trailing text)?
Consider the following text:
<bla><bla text><bla>
I want to get the regex to match exactly the middle <bla text>
. I tried \<.*?text.*?\>
but it is capturing the string right from the start, since it starts with '<'.
Thanks a lot.
What do you think about
\<[^>]*text[^>]*?\>
The idea is to avoid matching arbitrary characters with .
and instead match anything except >
using [^>]*
before and after your "text".
See here on Regexr
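To see the difference on the sample string, here is a minimal sketch in Python (my choice of language, the question does not name one). The angle brackets are left unescaped because Python's `re` treats `<` and `>` as ordinary characters; the variable names are just for illustration.

```python
import re

text = "<bla><bla text><bla>"

# Lazy attempt from the question: the match still begins at the first '<',
# because '.' is allowed to run across the '>' characters.
lazy = re.search(r"<.*?text.*?>", text)
print(lazy.group(0))      # <bla><bla text>

# Negated character class: [^>]* cannot cross a '>', so the match
# has to stay inside a single pair of angle brackets.
bounded = re.search(r"<[^>]*text[^>]*>", text)
print(bounded.group(0))   # <bla text>
```

Laziness only controls how far a quantifier extends to the right; it does not move the starting point of the match, which is why the first attempt begins at the very first `<`.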
This regex matches the central <bla text>
and captures it as the first group (brackets included):
(\<\w+? \w+?\>)
Explained, it matches:
- a
<
- then, any non-empty sequence of word characters (
\w
is shorthand for [a-zA-Z0-9_]
)
- then exactly one space
- then another non-empty sequence of word characters (the +? quantifier is lazy rather than greedy, but the result is the same here)
- a final
>
That is, it matches exactly two words separated by exactly one space, all enclosed in <..>
.
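As a quick check of the breakdown above, here is a small Python sketch (Python is my assumption, and the extra test strings are made up for illustration):

```python
import re

# Same pattern as above, with the brackets left unescaped
# (Python's re does not require escaping '<' or '>').
pattern = re.compile(r"(<\w+? \w+?>)")

m = pattern.search("<bla><bla text><bla>")
print(m.group(1))                                  # <bla text>

# \w also accepts digits and underscores, not only letters.
print(bool(pattern.fullmatch("<tag_1 value2>")))   # True

# Three words inside the brackets do not match.
print(pattern.search("<a b c>"))                   # None
```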
This one:
(\<\w+?\s+\w+?\>)
also allows one or more whitespace characters (not just a single space) between the two words.
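A short Python sketch of the difference between the two variants (the sample with three spaces is a made-up input, not from the question):

```python
import re

single_space = re.compile(r"(<\w+? \w+?>)")
any_whitespace = re.compile(r"(<\w+?\s+\w+?>)")

sample = "<bla><bla   text><bla>"  # three spaces between the words

# The single-space version finds nothing in this input.
print(single_space.search(sample))              # None

# The \s+ version accepts the run of spaces (tabs would work too).
print(any_whitespace.search(sample).group(1))   # <bla   text>
```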
Finally, this one:
<\w+?>(\<\w+? \w+\>)<\w+?>
matches the whole string, but captures only the central block, so that, if you want to replace the <bla><bla text><bla>
string, you may refer to the central block using $1
or \1
in your replacement string.
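For instance, in Python (an assumption on my part, since the question does not name a language) the backreference is written `\1`; the square brackets in the replacement are only there to make the result visible:

```python
import re

pattern = re.compile(r"<\w+?>(<\w+? \w+>)<\w+?>")
text = "<bla><bla text><bla>"

# The whole string is matched, and group 1 holds the central block.
m = pattern.fullmatch(text)
print(m.group(1))        # <bla text>

# Replace the whole string, reusing the central block via \1.
result = pattern.sub(r"[\1]", text)
print(result)            # [<bla text>]
```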
Here is your regexp.
/>(<bla.*?>)/
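A rough Python check of this pattern, with the `/` delimiters dropped since Python's `re` does not use them. Note that it relies on the preceding `>` rather than on the word "text", so it picks up whatever tag comes second; the second input below is a made-up counterexample.

```python
import re

pattern = re.compile(r">(<bla.*?>)")

# On the sample from the question, group 1 is the central block.
first = pattern.search("<bla><bla text><bla>")
print(first.group(1))    # <bla text>

# If the tag containing "text" is not the second one, the capture differs.
second = pattern.search("<bla><bla><bla text>")
print(second.group(1))   # <bla>
```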