Converting URLs to HTML hyperlinks while avoiding already-formatted hyperlinks

So I have this code:

$sURLRegExp = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)/i';
$iURLMatches = preg_match($sURLRegExp, $sMessage, $aURLMatches);
if ($iURLMatches > 0) {
    $sURL = $aURLMatches[1];
    $sURL = str_replace('www.', '', $sURL);
    $sMessage = preg_replace($sURLRegExp, '<a href="http://$1" target="_blank">' . 
        $sURL . '</a>', $sMessage);
}

It does a perfect job of converting plain URLs in incoming messages into HTML hyperlinks, and it even removes the "http://" and "www." parts from the link text, for brevity.

Thing is, administrators of the site this runs on are able to enter HTML. If they do, the code turns their links into a horrid mess, something like <a href="<a href="http://www.site.com">site.com</a>">text</a>.
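To make the problem concrete, here is a minimal reproduction. It wraps the snippet above in a function so it can be run on sample input; the function name linkifyNaive and the sample messages are assumptions for illustration, not part of the site's code.

```php
<?php
// Hypothetical wrapper around the snippet above; the name is illustrative only.
function linkifyNaive(string $sMessage): string
{
    $sURLRegExp = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)/i';
    $iURLMatches = preg_match($sURLRegExp, $sMessage, $aURLMatches);
    if ($iURLMatches > 0) {
        $sURL = $aURLMatches[1];
        $sURL = str_replace('www.', '', $sURL);
        $sMessage = preg_replace($sURLRegExp, '<a href="http://$1" target="_blank">' .
            $sURL . '</a>', $sMessage);
    }
    return $sMessage;
}

// A plain URL is linked as intended.
echo linkifyNaive('See http://www.site.com for details'), "\n";

// A URL inside an administrator's anchor tag is linked again, inside the href.
echo linkifyNaive('<a href="http://www.site.com">text</a>'), "\n";
```

The second call shows the nesting: the pattern has no way of knowing that the URL it found already sits inside an href attribute.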

I tried altering the regular expression to make sure that there is no quotation mark after the given URL (which most likely indicates it's part of a hyperlink anchor tag) like so:

$sURLRegExp = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)([^"])/i';

...but it doesn't seem to work. I know about look-ahead assertions, but have no idea how to use them at all. Would that be the best thing to use in this case? How would I detect the presence of an anchor tag around this URL?

Note: I know I could just use strpos(...) !== false on the entire message, but that doesn't account for mixes of plain URLs and anchor tags in the same message.
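For anyone else unsure how assertions behave, here is a small sketch of negative lookahead and negative lookbehind in PCRE, using made-up sample strings. Both are zero-width: they test the neighbouring character without making it part of the match.

```php
<?php
// Negative lookahead (?!"): match "com" only when the next character is not a double quote.
$iPlain  = preg_match('/com(?!")/', 'site.com rocks', $aPlain);   // matches
$iQuoted = preg_match('/com(?!")/', 'href="site.com"', $aQuoted); // does not match

// Negative lookbehind (?<!"): match "http" only when the previous character is not a double quote.
$iBare = preg_match('/(?<!")http/', 'go to http://site.com', $aBare);  // matches
$iAttr = preg_match('/(?<!")http/', 'href="http://site.com"', $aAttr); // does not match

// The character that was tested is not included in the matched text.
var_dump($iPlain, $aPlain[0], $iQuoted, $iBare, $aBare[0], $iAttr);
```

A plain group such as ([^"]) behaves differently: it has to consume a real character, so that character becomes part of whatever gets replaced.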


Hmm, turns out I hadn't searched Stack Overflow thoroughly enough. All I had to do was add (?<![">]) to the beginning of my regular expression, like so:

$sURLRegExp = '/(?<![">])http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)([^"])/i';

...and it works perfectly. I'm keeping this for future reference for anybody else who happens upon this post.
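For future readers, here is a hedged sketch of how the lookbehind version might be packaged. It departs from the code above in two ways, both of which are my assumptions rather than something the post requires. First, it uses preg_replace_callback so that each URL gets its own link text; the original snippet builds the text from the first match only and reuses it for every URL in the message. Second, it drops the trailing ([^"]) group, since that group consumes one character after the URL and the replacement never puts it back. The function name linkifyPlainUrls is hypothetical.

```php
<?php
// Hypothetical helper; the name linkifyPlainUrls is not from the original code.
function linkifyPlainUrls(string $sMessage): string
{
    // (?<![">]) is a negative lookbehind: the match is rejected when the
    // character right before "http://" is a double quote or a ">".
    $sURLRegExp = '/(?<![">])http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)/i';

    return preg_replace_callback(
        $sURLRegExp,
        function (array $aMatch): string {
            // Build the visible text per match, so each link shows its own URL.
            $sText = str_replace('www.', '', $aMatch[1]);
            return '<a href="http://' . $aMatch[1] . '" target="_blank">' . $sText . '</a>';
        },
        $sMessage
    );
}

// The plain URL is linked; the administrator's existing anchor is left alone.
echo linkifyPlainUrls('Plain http://www.site.com and <a href="http://www.other.org">text</a>'), "\n";
```

This only guards against URLs that directly follow a double quote or a ">", which covers href="..." attributes and link text placed right after the opening tag. Single-quoted or unquoted attributes would still need separate handling.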
