XPath insert before and after with DOM and PHP
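I need to add a wrapper element to an HTML structure.
The wrapper is a <div> with the class "container". It should open right after the <h4> in each <div><ul><li> (only the direct children of that <ul> and their siblings, not the grandchildren), or at the start of the <li> when there is no <h4>, and it should close right before the closing tag of the same <li>.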
My whole code looks like this:
<?php
$content = '
<div class="sidebar-1">
<ul>
<li>
<h4>Title</h4>
<ul>
<li><a href="http://www.test.com">Test</a></li>
<li><a href="http://www.test.com">Test</a></li>
</ul>
</li>
<li>
<p>Paragraf</p>
</li>
<li>
<h4>New title</h4>
<ul>
<li>Some text</li>
<li>Some text åäö</li>
</ul>
</li>
</ul>
</div>
';
$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);
$start_text = '<div class="container">';
$end_text = '</div>';
foreach($x->query('//div/ul/li') as $anchor)
{
$anchor->insertBefore(new DOMText($start_text),$anchor->firstChild);
}
echo $doc->saveXML($doc->getElementsByTagName('ul')->item(0));
?>
It works insofar as I can add the opening tag, but not the closing one. I also get strange character encoding in the output. I want the output to use the same encoding as the input.
The result should be:
<div class="sidebar-1">
<ul>
<li>
<h4>Title</h4>
<div class="content">
<ul>
<li><a href="http://www.test.com">Test</a></li>
<li><a href="http://www.test.com">Test</a></li>
</ul>
</div>
</li>
<li>
<div class="content">
<p>Paragraf</p>
</div>
</li>
<li>
<h4>New title</h4>
<div class="content">
<ul>
<li>Some text</li>
<li>Some text åäö</li>
</ul>
</div>
</li>
</ul>
</div>
I couldn't find a more elegant way to reassign all children, so I guess this will do. I think it gets what you're after, though.
(NOTE: Code updated to reflect additional requirements in the comments.)
$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);
foreach($x->query('//div/ul/li') as $anchor)
{
    // Create the wrapper <div class="container"> in the same document.
    $container = $doc->createElement('div');
    $container->setAttribute('class', 'container');
    // Walk the children, saving nextSibling before a node is moved.
    $next = $anchor->firstChild;
    while ($next !== NULL) {
        $curr = $next;
        $next = $curr->nextSibling;
        // Move every child into the wrapper, except an <h4> whose class
        // attribute contains "title"; that heading stays where it is.
        if (($curr->nodeName != 'h4')
            || ($curr->attributes === NULL)
            || ($curr->attributes->getNamedItem('class') === NULL)
            || !preg_match('#(^| )title( |$)#', $curr->attributes->getNamedItem('class')->nodeValue)
        ) {
            $container->appendChild($anchor->removeChild($curr));
        }
    }
    // Append the wrapper as the last child of the <li>.
    $anchor->appendChild($container);
}
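Note that the loop above only leaves an <h4> outside the wrapper when it carries the class "title". The sample markup in the question has plain <h4> elements, so here is a minimal sketch of a variant that keeps every <h4> outside, regardless of class. The function name wrapChildrenExceptHeadings and the shortened sample markup are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): moves every child of
// $parent except <h4> elements into a new <div class="$class">.
function wrapChildrenExceptHeadings(DOMNode $parent, string $class): void
{
    $wrapper = $parent->ownerDocument->createElement('div');
    $wrapper->setAttribute('class', $class);

    // Copy the live child list first so that moving nodes does not disturb the loop.
    foreach (iterator_to_array($parent->childNodes) as $child) {
        if ($child->nodeName !== 'h4') {
            // appendChild() on an attached node moves it to the new parent.
            $wrapper->appendChild($child);
        }
    }
    $parent->appendChild($wrapper);
}

// Shortened sample markup, assumed for this sketch.
$doc = new DOMDocument();
$doc->loadHTML('<div><ul><li><h4>Title</h4><p>One</p></li><li><p>Two</p></li></ul></div>');

$xpath = new DOMXPath($doc);
foreach ($xpath->query('//div/ul/li') as $li) {
    wrapChildrenExceptHeadings($li, 'container');
}

// Serialise only the outer list, as the question does.
echo $doc->saveXML($doc->getElementsByTagName('ul')->item(0)), "\n";
```

Copying the child list with iterator_to_array() replaces the manual nextSibling bookkeeping. Whether that counts as more elegant is a matter of taste.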
As for character encoding, I've been messing with it for a while and it's a tricky issue. The characters display correctly when you load with loadXML() but not with loadHTML(). There's a workaround in the comments, but it ain't pretty. Hopefully some of the user comments will help you find a usable solution.
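For illustration, one commonly used workaround (not necessarily the one referred to above) is to declare the encoding inside the markup before passing it to loadHTML(), which otherwise tends to read the bytes as a single-byte encoding. The sketch below assumes the input is UTF-8. The helper name loadUtf8Html is hypothetical, and how the serialised output is escaped may still depend on your libxml version, so check it on your own setup.

```php
<?php
// Hypothetical helper, not part of the original answer: declares the input as
// UTF-8 through a <meta> tag before handing it to loadHTML().
function loadUtf8Html(string $html): DOMDocument
{
    $doc = new DOMDocument();
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $doc->loadHTML($meta . $html);
    return $doc;
}

// Assumes this source file itself is saved as UTF-8.
$doc = loadUtf8Html('<ul><li>Some text åäö</li></ul>');
$list = $doc->getElementsByTagName('ul')->item(0);

// DOM strings are UTF-8 internally, so the text comes back unchanged
// when the declared encoding was honoured by the parser.
echo $list->textContent, "\n";

// Serialise only the list, as in the question.
echo $doc->saveXML($list), "\n";
```

The <meta> element ends up in an automatically generated <head>, so serialising only the node you need, as above, keeps it out of the output.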