Remove links with empty contents and add a redirect to links whose site is not on the allowed list

<?php
$allowedURLHosts = array(
'youtube.com',
'google.com'
);

$blockedURLHosts = array(
'yahoo.com',
'208.71.34.142'
);


function getDomain($url)
{
    $pieces = parse_url($url);
    $domain = isset($pieces['host']) ? $pieces['host'] : '';
    if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs))
    {
        return $regs['domain'];
    }
    return '';
}

function filterLinks($str)
{
    $dom = new DOMDocument;
    @$dom->loadHTML($str);

    global $blockedURLHosts;
    global $allowedURLHosts;

    // Get all links in the document.
    $links = $dom->getElementsByTagName('a');
    $linksLength = $links->length;

    // Iterate over all links.
    while ($linksLength--)
    {
        $link = $links->item($linksLength);
        if ($link->hasAttribute('href'))
        {
            // Get the href attribute of the link.
            $src = $link->getAttribute('href');
            if ($link->nodeValue != '')
            {
                if (in_array(getDomain($src), $blockedURLHosts))
                {
                    $link->parentNode->removeChild($link);
                }
                else if (!in_array(getDomain($src), $allowedURLHosts))
                {
                    $newlink = $dom->createElement('a');

                    if ($link->hasAttribute('title'))
                    {
                        $newlink->setAttribute('title', $link->getAttribute('title'));
                    }
                    $newlink->setAttribute('href', '/redirect?link=' . urlencode($src));

                    $newlink->nodeValue = $link->nodeValue;

                    $link->parentNode->replaceChild($newlink,$link);
                }
            }
            else
            {
                $link->parentNode->removeChild($link);
            }
        }
    }
    $html = '';
    foreach($dom->getElementsByTagName('body')->item(0)->childNodes as $node)
    {
        $html .= $dom->saveXML($node, LIBXML_NOEMPTYTAG);
    }
    return $html;
}

$text = '<div class="InfoText"><a href="http://www.getmyspacecomments.com/"><img src="http://ohiok.com/img/k75/laserxpc/a1/04.jpg" title="MySpace Comment Codes" alt="image" style="border: 0px;"></a><br><a href="http://www.getmyspacecomments.com/"><span style="font-size: large;">MySpace Comments</span></a><br></div>

<div style="text-align: center;"><a href="http://www.glitterbell.com/" title="MySpace Comments"></a><br><a href="http://www.glitterbell.com/">MySpace Comments at GlitterBell.com</a><br><a href="http://www.myspace.com/469121002">Add the Comment App</a></div>';

echo filterLinks($text);
?>

The link that wraps an img element is not showing up in the output. The remaining links also don't try to load a page when I click them, for some reason. I'm not fully done with this script yet, but so far it's not working, and I'm not sure what I'm doing wrong.

<a href="http://www.glitterbell.com/" title="MySpace Comments"></a> should be removed because the text inside of it is empty.

Code Output:

<div class="InfoText"><br></br><a href="/redirect?link=http%3A%2F%2Fwww.getmyspacecomments.com%2F">MySpace Comments</a><br></br></div> 

<div style="text-align: center;"><br></br><a href="/redirect?link=http%3A%2F%2Fwww.glitterbell.com%2F">MySpace Comments at GlitterBell.com</a><br></br><a href="/redirect?link=http%3A%2F%2Fwww.myspace.com%2F469121002">Add the Comment App</a></div>


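I would say the img element isn't showing up because you are working with the nodeValue property, which holds only an element's text content. It does not serialise HTML and should not be used for handling HTML. A link that contains only an img has an empty nodeValue, so your `$link->nodeValue != ''` check treats it as an empty link and removes it.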
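To illustrate the point about nodeValue, here is a minimal sketch. The markup is made up for the demonstration, and `isEmptyLink` is a hypothetical helper, not something from your script; it assumes that "empty" should mean "no text and no child elements".

```php
<?php
// Hypothetical helper: a link counts as empty only when it has
// no non-whitespace text AND no descendant elements (such as <img>).
function isEmptyLink(DOMElement $link): bool
{
    return trim($link->textContent) === ''
        && $link->getElementsByTagName('*')->length === 0;
}

// Made-up sample markup: an image link, a text link, and an empty link.
$dom = new DOMDocument;
@$dom->loadHTML(
    '<p><a href="#"><img src="a.jpg" alt="image"></a>'
    . '<a href="#"><span>MySpace Comments</span></a>'
    . '<a href="#"></a></p>'
);

$links     = $dom->getElementsByTagName('a');
$imageLink = $links->item(0);
$textLink  = $links->item(1);
$emptyLink = $links->item(2);

// nodeValue is text only, so the image link looks empty to a nodeValue check.
var_dump($imageLink->nodeValue === '');   // bool(true)
var_dump($textLink->nodeValue);           // string(16) "MySpace Comments"

// The helper tells the image link apart from the truly empty one.
var_dump(isEmptyLink($imageLink));        // bool(false)
var_dump(isEmptyLink($emptyLink));        // bool(true)
```

With a check along these lines, the image link would be kept and only the truly empty glitterbell link would be removed.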

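If you want the new link to have exactly the same children, copy or move the child nodes themselves instead of assigning nodeValue. Note that the childNodes property is read-only in PHP's DOM extension, so each child has to be appended to the new element individually.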
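A rough sketch of moving the children over could look like this. `replaceWithRedirect` is a hypothetical function name and the sample markup is invented; the `/redirect?link=` prefix is taken from your code. It relies on the fact that appending a node that already has a parent moves it rather than copying it.

```php
<?php
// Hypothetical helper: replaces $link with a redirect link that keeps
// the original child nodes (text, <img>, <span>, ...) intact.
function replaceWithRedirect(DOMDocument $dom, DOMElement $link, string $redirectBase): void
{
    $newlink = $dom->createElement('a');
    $newlink->setAttribute('href', $redirectBase . urlencode($link->getAttribute('href')));

    if ($link->hasAttribute('title')) {
        $newlink->setAttribute('title', $link->getAttribute('title'));
    }

    // appendChild() moves an existing node, so this loop empties $link
    // one child at a time and ends when nothing is left.
    while ($link->firstChild) {
        $newlink->appendChild($link->firstChild);
    }

    $link->parentNode->replaceChild($newlink, $link);
}

// Made-up sample markup with an image-only link.
$dom = new DOMDocument;
@$dom->loadHTML('<div><a href="http://example.com/"><img src="a.jpg" alt="image"></a></div>');

$link = $dom->getElementsByTagName('a')->item(0);
replaceWithRedirect($dom, $link, '/redirect?link=');

// Serialise only the <div>, using saveHTML() so void elements stay as HTML.
$out = $dom->saveHTML($dom->getElementsByTagName('div')->item(0));
echo $out;
```

The sketch serialises with saveHTML() rather than saveXML() with LIBXML_NOEMPTYTAG. That is a separate design choice from the nodeValue fix, but it should also avoid the `<br></br>` pairs seen in your output.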
