
xPath insert before and after - With DOM and PHP

I need to wrap part of an HTML structure in a div with a class.

The class is called "container". The wrapper should open right after the </h4> inside <div><ul><li> (that is, among the children of the outer ul and their siblings, not the grandchildren) and should close right before the closing tag of that same li.

My whole code looks like this:

<?php
$content = '
    <div class="sidebar-1">
        <ul>
            <li>
                <h4>Title</h4>
                <ul> 
                    <li><a href="http://www.test.com">Test</a></li> 
                    <li><a href="http://www.test.com">Test</a></li> 
                </ul> 
            </li> 
            <li>
                <p>Paragraf</p>
            </li> 
            <li>
                <h4>New title</h4>
                <ul> 
                    <li>Some text</li>
                    <li>Some text åäö</li>
                </ul> 
            </li> 
        </ul>
    </div>
';

$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);

$start_text = '<div class="container">';
$end_text = '</div>';

foreach($x->query('//div/ul/li') as $anchor)
{
    $anchor->insertBefore(new DOMText($start_text),$anchor->firstChild);
}
echo $doc->saveXML($doc->getElementsByTagName('ul')->item(0));
?>

This works in so far as I can add the opening tag of the wrapper, but not the closing one. I also get strange character encoding in the output. I want the output to have the same encoding as the input.

The result should be

    <div class="sidebar-1">
        <ul>
            <li>
                <h4>Title</h4>
                <div class="content">
                    <ul> 
                        <li><a href="http://www.test.com">Test</a></li> 
                        <li><a href="http://www.test.com">Test</a></li> 
                    </ul>
                </div>
            </li> 
            <li>
                <div class="content">
                    <p>Paragraf</p>
                </div>
            </li> 
            <li>
                <h4>New title</h4>
                <div class="content">
                    <ul> 
                        <li>Some text</li>
                        <li>Some text åäö</li>
                    </ul> 
                </div>
            </li> 
        </ul>
    </div>


I couldn't find a more elegant way to reassign all children, so I guess this will do. I think it gets what you're after, though.

(NOTE: Code updated to reflect additional requirements in the comments.)

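$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);

foreach($x->query('//div/ul/li') as $anchor)
{
    // Build the wrapper as a real element; markup inserted as DOMText is escaped on output.
    $container = $doc->createElement('div');
    $container->setAttribute('class', 'container');

    // Remember the next sibling before moving the current node,
    // because moving it detaches it from $anchor.
    $next = $anchor->firstChild;
    while ($next !== NULL) {
        $curr = $next;
        $next = $curr->nextSibling;

        // Move every child into the wrapper, except an <h4> whose class list contains "title".
        if (($curr->nodeName != 'h4')
            || ($curr->attributes === NULL)
            || ($curr->attributes->getNamedItem('class') === NULL)
            || !preg_match('#(^| )title( |$)#', $curr->attributes->getNamedItem('class')->nodeValue)
        ) {
            $container->appendChild($anchor->removeChild($curr));
        }
    }

    // The wrapper ends up as the last child of the <li>, after any heading that stayed behind.
    $anchor->appendChild($container);
}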
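
For comparison, here is a minimal sketch of the same idea for the simpler case shown in the question, where every h4 stays outside the wrapper whether or not it has a class. The function name wrapListItems and the shortened sample markup are made up for illustration. The wrapper has to be an element node because text inserted as DOMText is escaped when the document is serialized.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps every child
// of each matched <li>, except <h4> headings, in <div class="container">.
function wrapListItems(DOMDocument $doc, string $query = '//div/ul/li'): void
{
    $xpath = new DOMXPath($doc);

    // The XPath result is evaluated once, so nodes moved inside the loop
    // do not add new matches to it.
    foreach ($xpath->query($query) as $li) {
        $container = $doc->createElement('div');
        $container->setAttribute('class', 'container');

        // Copy the live childNodes list to an array first; moving nodes
        // while iterating the live list would skip entries.
        foreach (iterator_to_array($li->childNodes) as $child) {
            if ($child->nodeName !== 'h4') {
                // appendChild() moves a node that is already in the tree.
                $container->appendChild($child);
            }
        }

        $li->appendChild($container);
    }
}

// Shortened sample input, assumed for illustration only.
$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="sidebar-1"><ul>'
    . '<li><h4>Title</h4><ul><li>Test</li></ul></li>'
    . '<li><p>Paragraf</p></li>'
    . '</ul></div>'
);

wrapListItems($doc);
echo $doc->saveHTML($doc->getElementsByTagName('div')->item(0));
```

Because the wrapper is appended last, this only gives the expected layout when the heading comes first in the li, as it does in the sample.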

As for character encoding, I've been messing with it for a while and it's a tricky issue. The characters display correctly when you load with loadXML() but not with loadHTML(). There's a workaround in the comments, but it ain't pretty. Hopefully some of the user comments will help you find a usable solution.
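
One commonly used workaround, which may or may not be the one referred to above, is to turn every non-ASCII character into a numeric entity before calling loadHTML(), so the parser never has to guess the input encoding. The sketch below assumes the input string is valid UTF-8 and that the mbstring extension is installed; loadUtf8Html is a made-up helper name.

```php
<?php
// Hypothetical helper (not from the original answer): loads UTF-8 HTML
// into a DOMDocument without relying on loadHTML() detecting the encoding.
// Assumes $html is valid UTF-8 and the mbstring extension is available.
function loadUtf8Html(string $html): DOMDocument
{
    // Convert every code point from U+0080 upwards to a numeric entity,
    // e.g. "å" becomes "&#229;". The input is then plain ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    $ascii = mb_encode_numericentity($html, $map, 'UTF-8');

    $doc = new DOMDocument();
    $doc->loadHTML($ascii);

    return $doc;
}

$doc = loadUtf8Html('<ul><li>Some text åäö</li></ul>');
$li = $doc->getElementsByTagName('li')->item(0);

// The DOM hands strings back as UTF-8, so the text round-trips intact.
echo $li->textContent, "\n";

// Serialize a single node as HTML rather than XML.
echo $doc->saveHTML($li), "\n";
```

The entities are decoded again during parsing, so XPath queries and textContent see the original characters rather than the entity text.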
