Parse links, except for links inside a src=""
I have the following code, which replaces URLs with the corresponding links:
$in = array
(
'/(?:^|\b)((((http|https|ftp):\/\/)|(www\.))([\w\.]+)([,:%#&\/?=\w+\.-]+))(?:\b|$)/is'
);
$out = array
(
"<a href=\"$1\" target=\"_blank\">$1</a>"
);
return preg_replace($in, $out, $url);
However, I do not want URLs inside a src="url" attribute to be converted into links.
How can I exclude URLs enclosed inside an attribute from this pattern?
UPDATE: input would be:
Below you can see http://www.yahoo.com bla bla
<iframe src="http://yahoo.com"></iframe>
It needs to parse the first link but not the URL inside the src=""
Use this PHP code to extract links, except for those inside src="":
<?php
$p = '/((<)(?(2).*?src=[^>]*>).*?)*?((?:(?:(?:http|https|ftp):\/\/)|(?:www\.))(?:[\w\.]+)(?:[,:%#&\/?=\w+\.-]+))/smi';
// multi-line input text
$str = 'Visit http://www.google.com bla bla <iframe src="http://apple.com">
</iframe> Below you can see http://www.ibm.com bla bla';
preg_match_all($p, $str, $m);
var_dump( $m[3] );
?>
OUTPUT:
array(2) {
[0]=>
string(21) "http://www.google.com"
[1]=>
string(18) "http://www.ibm.com"
}
SUGGESTION:
Rather than making an exception only for src="" when extracting links, I think it would be better to exclude all links enclosed in < and > by using the following regex:
$p = '/((<)(?(2)[^>]*>)(?:.*?))*?((?:(?:http|https|ftp):\/\/|www\.).*?[,:%#&\/?=\w+\.-]+)/smi';
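The patterns above extract the URLs, while the question asks for them to be replaced with links. One possible way to do the replacement is to split the text on tags and run the question's preg_replace only on the pieces between them. This is a minimal sketch and not the method used above: the function name linkify_outside_tags is hypothetical, the URL pattern is copied from the question, and the tag pattern <[^>]*> is a simplifying assumption that will misbehave if an attribute value contains a > character.

```php
<?php
// Hypothetical helper (not from the original post): converts URLs to links,
// but leaves anything between < and > untouched.
function linkify_outside_tags(string $text): string
{
    // URL pattern copied from the question; $1 is the whole URL.
    $url = '/(?:^|\b)((((http|https|ftp):\/\/)|(www\.))([\w\.]+)([,:%#&\/?=\w+\.-]+))(?:\b|$)/is';

    // Split into alternating pieces: plain text at even indexes,
    // captured tags at odd indexes (PREG_SPLIT_DELIM_CAPTURE keeps the tags).
    // Assumption: a tag is "<", any run of non-">" characters, then ">".
    $parts = preg_split('/(<[^>]*>)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Plain text: apply the replacement from the question.
            $parts[$i] = preg_replace($url, '<a href="$1" target="_blank">$1</a>', $part);
        }
        // Odd indexes are tags such as <iframe src="...">; keep them as they are.
    }

    return implode('', $parts);
}

$input = "Below you can see http://www.yahoo.com bla bla\n"
       . '<iframe src="http://yahoo.com"></iframe>';
echo linkify_outside_tags($input), "\n";
```

With the input from the question, only http://www.yahoo.com is wrapped in an anchor and the iframe tag is returned unchanged. Note that text inside an existing <a>...</a> element is still treated as plain text, so already-linked URLs would need extra handling, and for arbitrary HTML a real parser such as DOMDocument is likely more robust than any regex.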