Parse links, except for links inside a src=""

I have the following code, which replaces URLs with the corresponding links:

$in = array
(
        '/(?:^|\b)((((http|https|ftp):\/\/)|(www\.))([\w\.]+)([,:%#&\/?=\w+\.-]+))(?:\b|$)/is'
);
$out = array
(
        "<a href=\"$1\" target=\"_blank\">$1</a>"
);
return preg_replace($in, $out, $url);

However, I do not want URLs inside a src="url" attribute to be converted into links.

How can I exclude URLs enclosed in an attribute from this pattern?

UPDATE: the input would be:

Below you can see http://www.yahoo.com bla bla
<iframe src="http://yahoo.com"></iframe>

It needs to parse the first link but not the URL inside the src="".


Use this PHP code to extract links, except those inside src="":

<?php
   $p = '/((<)(?(2).*?src=[^>]*>).*?)*?((?:(?:(?:http|https|ftp):\/\/)|(?:www\.))(?:[\w\.]+)(?:[,:%#&\/?=\w+\.-]+))/smi';

   // multi-line input text
   $str = 'Visit http://www.google.com bla bla <iframe src="http://apple.com">
           </frame> Bellow you can see http://www.ibm.com bla bla';

   preg_match_all($p, $str, $m);
   var_dump( $m[3] );
?>

OUTPUT:

array(2) {
  [0]=>
  string(21) "http://www.google.com"
  [1]=>
  string(18) "http://www.ibm.com"
}


SUGGESTION:

Rather than making an exception only for src="", I think it would be better to exclude every link enclosed between < and >, using the following regex:

$p = '/((<)(?(2)[^>]*>)(?:.*?))*?((?:(?:http|https|ftp):\/\/|www\.).*?[,:%#&\/?=\w+\.-]+)/smi';
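The patterns above extract the URLs, while the question asks for them to be replaced with links. One way to apply the same "skip everything between < and >" idea to replacement is to split the input into tags and text, then run the original replacement on the text parts only. The following is a minimal sketch of that alternative, not the regex above: the function name linkify_outside_tags is hypothetical, and the URL pattern is a lightly condensed version of the one in the question. It assumes no tag contains a literal > inside an attribute value, and it does not skip text that already sits inside an <a> element.

```php
<?php
// Hypothetical helper: turn URLs into links, but only in text outside of tags.
function linkify_outside_tags(string $html): string
{
    // URL pattern condensed from the one in the question (same character classes).
    $url = '/\b((?:(?:https?|ftp):\/\/|www\.)[\w.]+[,:%#&\/?=\w+.-]+)(?:\b|$)/i';

    // Split on tags and keep them in the result. With one capturing group and
    // PREG_SPLIT_DELIM_CAPTURE, text lands at even indexes and tags at odd ones.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Text between tags: safe to linkify.
            $parts[$i] = preg_replace(
                $url,
                '<a href="$1" target="_blank">$1</a>',
                $part
            );
        }
        // Odd indexes are tags such as <iframe src="...">; leave them untouched.
    }

    return implode('', $parts);
}

// Usage with the input from the question.
$input = 'Below you can see http://www.yahoo.com bla bla '
       . '<iframe src="http://yahoo.com"></iframe>';
echo linkify_outside_tags($input), "\n";
```

With this input, only http://www.yahoo.com is wrapped in an <a> element; the URL in the iframe's src attribute is left as it was. For arbitrary or malformed HTML, a real parser such as PHP's DOMDocument is likely a more robust choice than any regex-based split.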