Convert text links to HTML, taking context into account

I want to convert plain-text links such as http://google.com/ to HTML links. However, if a URL is already part of an HTML link, either in the href="" attribute or in the link text, I don't want to convert it.

I found this in another question:

preg_replace('@(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1" target="_blank">$1</a>', $text);

However if I have something such as:

<a href="http://google.com/">http://google.com/</a>

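already in the target text, the replacement matches both the href value and the link text, so it creates two more links inside the existing one. I can't figure out a pattern that can tell whether a URL sits inside the quotes of an attribute or right before a closing </a>.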


Do not use regular expressions for (X)HTML parsing. Use DOM instead! The XPath //text()[not(ancestor::a) and contains(., 'http://')][1] should find the first text node that contains at least one HTTP URL and is not itself inside an anchor element (the test as written only looks for http://, so extend it if https:// URLs must be caught too). You can then naively replace that text node with three nodes: a text node holding the text before the URL, an anchor element with an href attribute and a text node that both hold the URL, and a text node holding the remaining text. Repeat until the XPath no longer matches any text node.
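The DOM approach described above could look roughly like the sketch below. It is a variation rather than a literal transcription: instead of re-running the XPath for the first match each time, it collects all matching text nodes once and splits each one with preg_split. The function name linkify_html, the wrapper div, and the simplified URL pattern are assumptions made for illustration only. The pattern is deliberately loose and will, for example, swallow trailing punctuation.

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> elements, skipping any text
// that already sits inside an anchor. Assumes $html is a UTF-8 fragment.
function linkify_html(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    // The meta tag tells libxml the input is UTF-8; the div is a wrapper
    // we control so the fragment can be extracted again afterwards.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root  = $doc->getElementsByTagName('div')->item(0); // our wrapper
    $xpath = new DOMXPath($doc);
    // Copy the node list to an array because the tree is modified below.
    $nodes = iterator_to_array(
        $xpath->query('.//text()[not(ancestor::a) and contains(., "http")]', $root)
    );

    foreach ($nodes as $textNode) {
        // Simplified URL pattern (an assumption, not the one from the question).
        $parts = preg_split(
            '@(https?://[^\s<>"]+)@',
            $textNode->nodeValue,
            -1,
            PREG_SPLIT_DELIM_CAPTURE
        );
        if (count($parts) < 2) {
            continue; // contained "http" but no URL
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) { // odd indexes hold the captured URLs
                $a = $doc->createElement('a');
                $a->setAttribute('href', $part);
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } else {            // even indexes hold the surrounding text
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the wrapper's children, not the html/body scaffolding.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling linkify_html('Visit http://google.com/ or <a href="http://google.com/">http://google.com/</a>') should link the first URL and leave the existing anchor untouched, because both the href value and the link text are out of reach of the text-node query.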


Based on mario's comment to my original post:

preg_replace('@(?<!href="|src="|">)(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1">$1</a>', $text);
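As a quick check of what the negative lookbehind buys, the snippet below runs that pattern over a string containing one bare URL and one existing link. The wrapper function linkify_regex and the sample text are made up for illustration. Note that the lookbehind only inspects the characters immediately before the URL, so it presumably still misfires on single-quoted attributes or on link text that does not start with the URL.

```php
<?php
// Hypothetical wrapper around the pattern above; the name is an assumption.
function linkify_regex(string $text): string
{
    // (?<!href="|src="|">) rejects a URL that directly follows href=", src=",
    // or "> (the end of an opening tag whose last attribute was quoted).
    $pattern = '@(?<!href="|src="|">)(https?:\/\/([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@';
    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

$text = 'See http://google.com/ and <a href="http://google.com/">http://google.com/</a>';
echo linkify_regex($text), "\n";
```

With this input only the bare URL after "See" is wrapped. The href value and the link text of the existing anchor are both skipped by the lookbehind.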

Works perfectly for replacing bbpress's unknown pasta salad.
