xpath: select text nodes before and after break tags
Consider the following markup, which mixes <br> and <br/>:
text1
<br>
text2
<br/>
text3
<br/>
text4
<br>
text5
How can I locate each text node?
I am thinking of a condition that matches text preceding OR following a br tag, but I am unsure whether <br> and <br/> are treated differently in XPath.
DOMDocument's loadHTML() method copes well with invalid HTML fragments, and its HTML parser turns both <br> and <br/> into the same br element, so you can use DOMXPath this way:
<?php
$html = 'text1
<br>
text2
<br/>
text3
<br/>
text4
<br>
text5';

// Show the raw input
echo "<pre>" . htmlentities($html) . "</pre><br>\n";

$dom = new DOMDocument();
// loadHTML() needs mb_convert_encoding() to work well with UTF-8 encoding.
// Note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2.
$dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));
$xpath = new DOMXPath($dom);

// Text nodes that have a br somewhere after them in document order
echo "Text nodes preceding br:";
foreach ($xpath->query('//text()[following::br]') as $node)
{
    var_dump($node->wholeText);
}

// Text nodes that have a br somewhere before them in document order
echo "Text nodes following br:";
foreach ($xpath->query('//text()[preceding::br]') as $node)
{
    var_dump($node->wholeText);
}

// Text nodes that satisfy either condition
echo "Text nodes following OR preceding br:";
foreach ($xpath->query('//text()[following::br or preceding::br]') as $node)
{
    var_dump($node->wholeText);
}
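To check the worry in the question directly, the same idea can be wrapped in small functions that count the br elements and collect the surrounding text. This is only a sketch, assuming PHP's DOM extension is available; loadFragment(), countBrElements() and textAroundBr() are hypothetical helper names, and the normalize-space() filter is an addition that skips whitespace-only text nodes.

```php
<?php
// Hypothetical helper: parse an HTML fragment leniently and return the document.
function loadFragment(string $html): DOMDocument
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Hypothetical helper: number of br elements, however they were written.
function countBrElements(string $html): int
{
    $xpath = new DOMXPath(loadFragment($html));
    return $xpath->query('//br')->length;
}

// Hypothetical helper: trimmed content of every non-blank text node
// that has a br before or after it in document order.
function textAroundBr(string $html): array
{
    $xpath = new DOMXPath(loadFragment($html));
    $query = '//text()[normalize-space()][preceding::br or following::br]';

    $result = [];
    foreach ($xpath->query($query) as $node) {
        $result[] = trim($node->nodeValue);
    }
    return $result;
}

$sample = "text1\n<br>\ntext2\n<br/>\ntext3\n<br/>\ntext4\n<br>\ntext5";
var_dump(countBrElements($sample)); // int(4): both spellings become br elements
print_r(textAroundBr($sample));     // text1 through text5
```

Trimming is done in PHP rather than in XPath so the query stays readable; drop the trim() call if the surrounding newlines matter.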
Your example is not valid XML against which an XPath query can be run: the <br> elements are never closed.
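However, text nodes are generally selected with the text() node test. Note that //br/text() selects text that is a child of a br, and a br has none; text next to a br is reached through an axis instead, for example //text()[preceding-sibling::br or following-sibling::br].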
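The difference between a child step such as //br/text() and a sibling-axis expression can be checked by counting matches side by side. The following is a rough sketch, assuming PHP's DOM extension and a lenient loadHTML() parse; countMatches() and the array labels are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: run several XPath expressions against one HTML
// fragment and return how many nodes each expression matched.
function countMatches(string $html, array $expressions): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $counts = [];
    foreach ($expressions as $label => $expression) {
        $counts[$label] = $xpath->query($expression)->length;
    }
    return $counts;
}

$counts = countMatches("text1\n<br>\ntext2\n<br/>\ntext3", [
    // Text children of br: a br is an empty element, so this matches nothing.
    'child'   => '//br/text()',
    // Text nodes that share a parent with a br on either side.
    'sibling' => '//text()[preceding-sibling::br or following-sibling::br]',
]);
print_r($counts);
```

The sibling axes are stricter than preceding:: and following::, since they only match text that shares a parent with the br, which is usually what is wanted when the text and the breaks sit in the same container.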