PHP Regex dot matches new line alternative

I am trying to come up with a regex to grab all text between 2 HTML tags. This is what I have so far:

<TAG[^>]*>(.*?)</TAG>

In theory, this should work perfectly. But executing it in PHP preg_replace with the options /ims results in the WHOLE string getting matched.

If I remove the /s flag, it works, but only when the tags are on one line, and my tags have newlines between them. Is there a better way of approaching this?


Of course there's a better way. Don't parse HTML with regex.

DOMDocument should be able to accommodate you better:

$dom = new DOMDocument();
$dom->loadHTMLFile('filename.html');

$tags = $dom->getElementsByTagName('tag');

echo $tags->item(0)->textContent; // Text content of the first `tag` element

You may have to tweak the above code (it hasn't been tested).
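
For a version that can be run without a file on disk, here is a minimal sketch that parses an inline string instead. The sample markup and the variable names are assumptions for illustration, since the question does not show its real input. `libxml_use_internal_errors(true)` is there only because libxml reports unknown element names such as `<tag>` as parse errors.

```php
<?php
// Hypothetical sample input; the question's real markup is not shown.
$html = "<div><tag class=\"x\">\nfvox\n</tag><p>other</p></div>";

// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$tags = $dom->getElementsByTagName('tag');

// item(0) returns null when no matching element exists, so check first.
$first = $tags->item(0);
$text = $first !== null ? trim($first->textContent) : null;

echo $text, PHP_EOL;
```

Newlines inside the element are no problem here, because the parser works on the document tree rather than on lines of text.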


I don't recommend using regex to match full HTML, but you can use the "dotall" flag: /REGEXP/s

Example:

$str = "<tag>
fvox
</tag>";

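// The "/" in the closing tag must be escaped because "/" is also the pattern delimiter.
preg_match_all('/<TAG[^>]*>(.*?)<\/TAG>/is', $str, $r);
print_r($r); // dump the matches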
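
If the `s` flag cannot be used, a common alternative is a character class that matches any character including newlines, such as `[\s\S]`. The sketch below reuses the sample string from the example above. The `#` delimiter is chosen here only so that the `/` in the closing tag needs no escaping.

```php
<?php
// Same sample string as in the example above.
$str = "<tag>
fvox
</tag>";

// [\s\S] matches any character, newlines included, so no s flag is needed.
// The i flag keeps the match case-insensitive (pattern says TAG, input says tag).
preg_match_all('#<TAG[^>]*>([\s\S]*?)</TAG>#i', $str, $r);

// $r[1] holds the text captured between the tags.
echo trim($r[1][0]), PHP_EOL;
```

The lazy quantifier `*?` still matters: it stops at the first closing tag, so several elements in the same string are captured separately.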