Converting URLs to HTML hyperlinks while avoiding already-formatted hyperlinks
So I have this code:
$sURLRegExp = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)/i';
$iURLMatches = preg_match($sURLRegExp, $sMessage, $aURLMatches);
if ($iURLMatches > 0) {
    $sURL = $aURLMatches[1];
    $sURL = str_replace('www.', '', $sURL);
    $sMessage = preg_replace($sURLRegExp, '<a href="http://$1" target="_blank">' .
        $sURL . '</a>', $sMessage);
}
It does a perfect job of converting all incoming messages so that plain URLs entered will turn into HTML hyperlinks that even remove the "http://" and "www." part, for brevity.
Thing is, administrators for the site on which this works are able to enter HTML. If they do, the code mangles their links into a horrid mess, something like <a href="<a href="http://www.site.com">site.com</a>">text</a>.
I tried altering the regular expression to make sure that there is no quotation mark after the given URL (which most likely indicates it's part of a hyperlink anchor tag) like so:
$sURLRegExp = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)([^"])/i';
...but it doesn't seem to work. I know about look-ahead assertions, but have no idea how to use them at all. Would that be the best thing to use in this case? How would I detect the presence of an anchor tag around this URL?
Note: I know I could just use strpos(...) !== false on the entire message, but that doesn't account for mixes of plain URLs and anchor tags in the same message.
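For anyone wondering why the trailing ([^"]) attempt fails: as far as I can tell, the regex engine simply backtracks. When the URL is followed by a quotation mark, the TLD part gives up its last letter so that ([^"]) has something to match. The sketch below shows this on a sample anchor tag, and compares it with a negative look-behind placed before "http". The sample string and the variable names $sAttempt, $sLookBehind and $sAnchor are mine, for illustration only, and the attempted pattern is written with its parentheses balanced.

```php
<?php
// Attempted pattern: requires a non-quote character AFTER the URL.
$sAttempt = '/http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)([^"])/i';

// Alternative: negative look-behind, checks the character BEFORE "http".
$sLookBehind = '/(?<![">])http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)/i';

// Illustrative input: a URL that is already inside an anchor tag.
$sAnchor = '<a href="http://www.site.com">text</a>';

// The attempt still matches: [a-z]{2,3} backtracks from "com" to "co",
// which leaves the "m" for ([^"]) to consume.
$iAttempt = preg_match($sAttempt, $sAnchor, $aMatches);
echo $iAttempt, ' ', $aMatches[1], "\n";

// The look-behind sits in front of the literal "http", so there is
// nothing to backtrack into and the quoted URL is skipped.
$iLookBehind = preg_match($sLookBehind, $sAnchor);
echo $iLookBehind, "\n";
```

A look-ahead such as (?!") at the end of the pattern would presumably suffer from the same backtracking, which is why checking the character before the URL is the more dependable test here.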
Hmm, turns out I hadn't searched Stack Overflow thoroughly enough. All I had to do was add a negative look-behind, (?<![">]), to the beginning of my regular expression, like so:
$sURLRegExp = '/(?<![">])http\:\/\/([a-z0-9\-\.]+\.[a-z]{2,3}(\/\S*)?)([^"])/i';
...and it works perfectly. I'm keeping this for future reference for anybody else who happens upon this post.
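For future readers, here is a minimal sketch of how the look-behind could be combined with preg_replace_callback. It is one possible variation, not the code I am running above. It assumes three changes of my own: the function name linkifyPlainUrls is made up, the path part uses [^\s"<]* instead of \S* so a URL stops at a quote or a tag, and the trailing ([^"]) group is dropped, because the replacement never puts that consumed character back. Using a callback also means each URL gets its own link text, whereas a single preg_match only captures the first URL in the message.

```php
<?php
// Hypothetical helper (not part of the original code): turns bare http://
// URLs into links and skips URLs that directly follow a quote or ">".
function linkifyPlainUrls(string $sMessage): string
{
    // (?<![">]) is a zero-width negative look-behind on the previous character.
    // The path class [^\s"<]* is an assumption; the original used \S*.
    $sURLRegExp = '/(?<![">])http:\/\/([a-z0-9\-.]+\.[a-z]{2,3}(\/[^\s"<]*)?)/i';

    return preg_replace_callback(
        $sURLRegExp,
        function (array $aMatch): string {
            // Build the visible text from this match only, stripping a
            // leading "www." (the original stripped "www." anywhere).
            $sText = preg_replace('/^www\./i', '', $aMatch[1]);
            return '<a href="http://' . $aMatch[1] . '" target="_blank">'
                . $sText . '</a>';
        },
        $sMessage
    );
}

// Usage: a plain URL is linked, an existing anchor is left untouched.
echo linkifyPlainUrls('Visit http://www.site.com/page now'), "\n";
echo linkifyPlainUrls('<a href="http://www.site.com">text</a>'), "\n";
```

This still only covers the cases discussed here: http:// URLs with two- or three-letter TLDs, and anchors whose URL directly follows a quotation mark or a closing angle bracket.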