Regex question - if I knew how to ask it properly I'd probably know the answer as well?
So basically my regex is not working as I expect & I don't know why.
I am working in a fairly regulated environment so this should not be too much of a problem - all the HTML tags are generated by a script and follow this pattern: only li, p and h(3-6) tags are present. All text is between tags and there are no spaces between tags.
I 'need' to write something to surround the lis with ul tags. Here is what I got:
preg_replace('#(<li>[^<p|<h]+</li>)(?!<li>)#', '<ul>$1</ul>', $html)
However, it only matches the last li pair in a set for some reason. Can anyone tell me why ... please?
[^<p|<h] doesn't do what you expect. It matches a single character that is not any of the characters <, p, | or h. If your HTML really is as constrained as you say, and you cannot have an <li> nested inside another <li>, then the following should work:
preg_replace('#(<li>.*?</li>)+#', '<ul>$0</ul>', $html)
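To see the difference, here is a small sketch that runs both patterns over a made-up sample string (the $html value and the word "apple" are assumptions for illustration, not taken from the question). The original pattern wraps only the last li of a run because the (?!<li>) lookahead rejects any li that is directly followed by another li, while the repeated group wraps the whole run once.

```php
<?php
// Hypothetical sample input that follows the constraints in the question.
$html = '<h3>Title</h3><li>red</li><li>blue</li><li>green</li><p>text</p>';

// Original attempt: the lookahead (?!<li>) only lets the last <li> of a run match.
$original = preg_replace('#(<li>[^<p|<h]+</li>)(?!<li>)#', '<ul>$1</ul>', $html);

// Suggested fix: match the whole run of <li>...</li> items and wrap it once.
$fixed = preg_replace('#(<li>.*?</li>)+#', '<ul>$0</ul>', $html);

// The class [^<p|<h] excludes the single characters <, p, | and h,
// so plain text that contains a "p" or an "h" is rejected as well.
$classMatches = preg_match('#^[^<p|<h]+$#', 'apple');

echo $original, "\n"; // <h3>Title</h3><li>red</li><li>blue</li><ul><li>green</li></ul><p>text</p>
echo $fixed, "\n";    // <h3>Title</h3><ul><li>red</li><li>blue</li><li>green</li></ul><p>text</p>
echo $classMatches, "\n"; // 0, because "apple" contains the letter p
```

Note that . does not match a newline by default, so this relies on each li having no line breaks inside it.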
The sequence .*? is just like .* except that the trailing ? is the non-greedy modifier. By default .* is greedy - it will consume as many characters as it can, then backtrack if the rest of the pattern doesn't match. The non-greedy modifier inverts this: it consumes as few characters as it can and advances only if the rest of the pattern cannot match. As the rest of the pattern is simply </li>, this effectively captures all text up to, but not including, the first sequence </li>. This pattern is then nested inside a capture group which is repeated with +, meaning it will match one or more consecutive <li>...</li> elements. Because a repeated group only retains its last iteration, the replacement uses $0 (the whole match) rather than $1.
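As a rough illustration of greedy versus non-greedy matching, the following sketch applies both forms to a made-up two-item string (the $html value and the variable names are assumptions for the example only):

```php
<?php
// Hypothetical input with two list items.
$html = '<li>red</li><li>blue</li>';

// Greedy: .* runs to the end of the string, then backtracks to the LAST </li>.
preg_match('#<li>.*</li>#', $html, $greedy);

// Non-greedy: .*? grows one character at a time and stops at the FIRST </li>.
preg_match('#<li>.*?</li>#', $html, $lazy);

echo $greedy[0], "\n"; // <li>red</li><li>blue</li>
echo $lazy[0], "\n";   // <li>red</li>
```

Here the greedy form still matches both items, but only because it swallows the </li><li> between them as ordinary text. If anything else sat between two lists, it would be swallowed too, which is why the non-greedy form inside a repeated group is the safer choice.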