XPath: select text nodes before and after break tags

Consider the following (a mixture of <br> and <br/>):

text1
<br>
text2
<br/>
text3
<br/>
text4
<br>
text5

How can I locate each text node?

I am thinking of something that matches the condition of preceding OR following a br tag, but I am unsure whether <br> and <br/> are treated differently in XPath.


DOMDocument's loadHTML() method copes well with invalid HTML fragments, and its HTML parser turns both <br> and <br/> into the same br element, so you can use DOMXPath this way:

<?php

$html = 'text1
<br>
text2
<br/>
text3
<br/>
text4
<br>
text5';

echo "<pre>" . htmlentities($html) . "</pre><br>\n";

$dom = new DOMDocument();
// loadHTML() assumes ISO-8859-1 unless a charset is declared, so UTF-8 input is
// converted to HTML entities first (the 'HTML-ENTITIES' target is deprecated as of PHP 8.2)
$dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8"));

$xpath = new DOMXPath($dom);

echo "Text nodes preceding br:";
foreach($xpath->query('//text()[(following::br)]') as $node)
{
    var_dump($node->wholeText);
}

echo "Text nodes following br:";
foreach($xpath->query('//text()[(preceding::br)]') as $node)
{
    var_dump($node->wholeText);
}

echo "Text nodes following OR preceding br:";
foreach($xpath->query('//text()[(following::br) or (preceding::br)]') as $node)
{
    var_dump($node->wholeText);
}
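
On the question of whether <br> and <br/> are treated differently, here is a minimal sketch that assumes the same DOMDocument and DOMXPath setup as above. It counts the br elements the parser produced and collects the matching text nodes with the surrounding newlines trimmed. The names $brCount and $texts are illustrative only and not part of the original answer.

```php
<?php
// Same fragment as in the question, mixing <br> and <br/>.
$html = "text1\n<br>\ntext2\n<br/>\ntext3\n<br/>\ntext4\n<br>\ntext5";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Both spellings are parsed into plain "br" elements, so one name test matches all of them.
$brCount = $xpath->query('//br')->length;

// Text nodes keep the newlines around them, so trim each value and skip empty ones.
$texts = [];
foreach ($xpath->query('//text()[following::br or preceding::br]') as $node) {
    $value = trim($node->nodeValue);
    if ($value !== '') {
        $texts[] = $value;
    }
}

var_dump($brCount);
var_dump($texts);
```

Under these assumptions, the single name test br covers every break tag in the fragment, so the XPath expression does not need to distinguish the two spellings.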


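Your example is not well-formed XML, so an XPath query cannot be run against it as-is: the <br> elements are never closed and there is no single root element.

However, in general you would select the text with the text() node test. Note that //br/text() would look for text inside each br, which is always empty, so use the sibling axes instead, something like //br/preceding-sibling::text() | //br/following-sibling::text()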
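
A minimal sketch of that idea in PHP, assuming the fragment is loaded with the HTML parser so that it parses at all. Because br is an empty element, the text sits beside it rather than inside it, so the sketch compares //br/text() with the sibling axes. The variable names are illustrative only.

```php
<?php
$html = "text1<br>text2<br/>text3<br/>text4<br>text5";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// br has no children, so asking for its text children finds nothing.
$childCount = $xpath->query('//br/text()')->length;

// Nearest text node immediately before each br.
$before = [];
foreach ($xpath->query('//br/preceding-sibling::text()[1]') as $node) {
    $before[] = trim($node->nodeValue);
}

// Nearest text node immediately after each br.
$after = [];
foreach ($xpath->query('//br/following-sibling::text()[1]') as $node) {
    $after[] = trim($node->nodeValue);
}

var_dump($childCount, $before, $after);
```

The [1] predicate restricts each step to the closest sibling text node, which is one way to read "the text right before or right after a break"; dropping it would return every sibling text node on that side.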
