Extract links from a string, put them in an array, and then parse them
I have a small regex function in PHP that makes all the links in a string clickable. It looks like this:
function clickable_link($text)
{
// Neutralize dangerous URI schemes by encoding the colon as an HTML entity.
$text = preg_replace('#(script|about|applet|activex|chrome):#is', "\\1&#058;", $text);
$ret = ' ' . $text;
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"\\2\" target=\"_blank\" rel=\"nofollow\" id=\"LinkWordWarp\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"http://\\2\" target=\"_blank\" rel=\"nofollow\" id=\"LinkWordWarp\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);
return $ret;
}
It works fine, but I would like a small adjustment: when the link is a YouTube link, it should not be rendered as
<a href=youtube>youtube</a>
but rather (if it is a YouTube link) as
<iframe width="425" height="349" src="http://www.youtube.com/embed/youtube" frameborder="0" allowfullscreen></iframe>
and
<img src="link" />
if it is an image.
Any help would be appreciated.
I have written a small script that does all of that, but it is far too slow:
<?php
function MakeContentInteractive($string)
{
$order = array("<br>", "<br/>", "<br />");
$replace = ' <br/> ';
$string = str_replace($order, $replace, $string);
$firstImageSetted = false;
$firstImage = "";
$allval = "";
$pieces = explode(" ", $string);
$regex = "^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]+\.(com|edu|gov|mil|net|org|biz|info|name|museum|us|ca|uk|co|tk)(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$^"; // SCHEME
$i=0;
foreach($pieces as $val)
{
echo $val."<hr>";
$i++;
$url = $val;
$url = str_replace(" ", "+", $url);
$strlen = strlen($url);
$ext = substr($val,$strlen-4,$strlen);
$random = rand(1000000,9000000);
if(preg_match($regex, $url))
{
/*CHECK IF IS YOUTUBE*/
$pos = strpos($url,"youtube.com");
if ($pos !== false)
{
// retrieve the video ID from the link
$videoLink = $val;
$videoLinkPharser = $videoLink;
$videoLinkPharser = substr($videoLinkPharser, 2, 42);
$vid = substr($videoLinkPharser, -11, 42);
//check if youtube link is valid
$youtubeId = $vid;
// Check whether the YouTube video exists by looking for a 200 response
$headers = get_headers('http://gdata.youtube.com/feeds/api/videos/' . $vid);
if (!strpos($headers[0], '200'))
{
$valid = 0;
}
else
{
$isYoutube = 1;
$valid = 1;
$code = '<div id="YoutubeLink"><iframe width="425" height="349" src="http://www.youtube.com/embed/'.$vid.'" frameborder="0" allowfullscreen></iframe></div>';
$allval = $allval.$code;
}
}
if(!$isYoutube == 1)
{
$url=trim($url);
/*CHECK IF IS PICTURE*/
$mime = getimagesize($url);
$mime = $mime['mime'];
if($mime == "image/gif" or $mime == "image/jpeg" or $mime == "image/png")
{
echo $url;
if(exif_imagetype($url) == IMAGETYPE_GIF and $ext == ".gif")
{
$isPicture = 1;
$filename =$random.basename($url);
$code = '<div id="CategoryPicture"><img src="'.$val.'" width="100" height="100" /><div>';
$allval = $allval.$code;
if($firstImageSetted == false)
{
$firstImage=$val;
$firstImageSetted = true;
}
}
if(exif_imagetype($url) == IMAGETYPE_JPEG and $ext == ".jpg")
{
$isPicture = 1;
$filename =$random.basename($url);
$code = '<div id="CategoryPicture"><img src="'.$val.'" width="100" height="100" /><div>';
$allval = $allval.$code;
if($firstImageSetted == false)
{
$firstImage=$val;
$firstImageSetted = true;
echo "JPG!";
}
}
if(exif_imagetype($url) == IMAGETYPE_PNG and $ext == ".png")
{
$isPicture = 1;
$filename =$random.basename($url);
$code = '<div id="CategoryPicture"><img src="'.$val.'" width="100" height="100" /><div>';
$allval = $allval.$code;
if($firstImageSetted == false)
{
$firstImage=$val;
$firstImageSetted = true;
}
}
}
}
/*IF not YOUTUBE or PICTURE then it's a link*/
if(!$isYoutube == 1 and !$isPicture == 1)
{
$text = preg_replace('#(script|about|applet|activex|chrome):#is', "\\1&#058;", $url);
$ret = ' ' . $text;
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"\\2\" target=\"_blank\" rel=\"nofollow\" id=\"LinkWordWarp\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"http://\\2\" target=\"_blank\" rel=\"nofollow\" id=\"LinkWordWarp\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);
$code = '<a href="'.$url.'">'.$url.'</a>';
$allval = $allval.$ret;
}
$isYoutube = 0;
$isPicture = 0;
}
else
{
$allval = $allval.$val;
}
}
echo "and the first image is: ".$firstImage."<br/>";
return $allval;
}
?>
The slow part is checking each image with exif_imagetype and getimagesize (about 3 seconds per picture). How can I solve that?
You could add
$ret = preg_replace("#http://www\.youtube\.com/watch\?v=([a-z0-9\-_]+)(&feature=[a-z_]*)*#is",
'<iframe width="425" height="349" src="http://www.youtube.com/embed/\1" frameborder="0" allowfullscreen></iframe>', $ret);
for YouTube and
$ret = preg_replace("#https?://[a-z0-9\-.]*/[^\s]+((\.jpg)|(\.jpeg)|(\.png)|(\.gif)|(\.bmp))#is",
'<img src="\0" />', $ret);
for images. It is better to do all the replacements in one pass, so that links that were already replaced are not replaced again. preg_replace accepts arrays for the pattern and replacement arguments, but it applies the patterns one after another to the result of the previous one, so the order matters; a single pattern with preg_replace_callback is a more reliable way to touch each link only once.
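The single-pass idea can be sketched with preg_replace_callback. This is only an illustration under some assumptions: the function name render_links_single_pass is made up, the URL pattern is deliberately loose, and the image check is still just an extension guess.

```php
<?php
// Hypothetical helper (not from the original post): renders every URL in one pass.
function render_links_single_pass(string $text): string
{
    // One pattern matches any http(s) URL; the callback decides how to render it,
    // so markup produced for one URL is never scanned again by another pattern.
    return preg_replace_callback(
        '#https?://[^\s<>"]+#i',
        function (array $m): string {
            $url  = $m[0];
            $safe = htmlspecialchars($url, ENT_QUOTES);

            // YouTube watch link: embed the video ID.
            if (preg_match('#^https?://www\.youtube\.com/watch\?v=([a-z0-9_\-]+)#i', $url, $yt)) {
                return '<iframe width="425" height="349" src="http://www.youtube.com/embed/'
                    . $yt[1] . '" frameborder="0" allowfullscreen></iframe>';
            }

            // URL ending with a common image extension: treat it as an image (a guess only).
            if (preg_match('#\.(jpe?g|png|gif|bmp)$#i', $url)) {
                return '<img src="' . $safe . '" />';
            }

            // Everything else becomes an ordinary link.
            return '<a href="' . $safe . '" target="_blank" rel="nofollow">' . $safe . '</a>';
        },
        $text
    );
}
```

Because the callback sees each URL exactly once, the order of the checks inside it replaces the ordering problem of several separate preg_replace calls.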
But you can't be sure that a URL points to an image until you get a response from the server. You can only guess that a link ending with ".jpg", ".jpeg", ".gif" or ".bmp" may be an image. It can also be something like "http://www.google.com/search?q=trollface.jpg", which ends with ".jpg" but is not an image. You can use cURL to check such links, but this may become a performance issue.
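A cheap way to avoid the query-string false positive without any network request is to look only at the path component of the URL. The sketch below assumes a made-up helper name, looks_like_image_url, and it is still a guess: only the server response can confirm the content type.

```php
<?php
// Hypothetical helper: guesses whether a URL points to an image by checking
// only the extension of the URL path, ignoring the query string.
function looks_like_image_url(string $url): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component, or a malformed URL
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, ['jpg', 'jpeg', 'png', 'gif', 'bmp'], true);
}
```

With this check, the Google search URL from the example is rejected because its path is "/search", while a URL whose path ends in ".jpg" is accepted.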
EDIT: OK, there is a problem with your updated code. The script is slow because it sends requests to other servers, and most of the delay is spent waiting for their responses. First, I don't think it is necessary to check whether the video exists on YouTube when you have a link like http://www.youtube.com/watch?v=blahblah&feature=blah . You can just take the code blahblah and embed it. If there is no such video, YouTube will say so; that is the problem of the person who posted the link. I think the preg_replace I wrote above is enough.
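As an alternative to the regex, the video code can also be taken from the query string with parse_url and parse_str, again without contacting YouTube. This is a sketch only: youtube_embed_html is a hypothetical name, and the set of characters allowed in a video ID is an assumption.

```php
<?php
// Hypothetical helper: builds the embed markup from a watch URL without any HTTP request.
function youtube_embed_html(string $url): ?string
{
    $host  = parse_url($url, PHP_URL_HOST);
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($host) || !is_string($query)) {
        return null;
    }
    // Accept youtube.com and its subdomains only.
    if (!preg_match('/(^|\.)youtube\.com$/i', $host)) {
        return null;
    }
    parse_str($query, $params);
    $id = $params['v'] ?? '';
    // Assumption: video IDs consist of letters, digits, "-" and "_".
    if (!is_string($id) || !preg_match('/^[A-Za-z0-9_-]+$/', $id)) {
        return null;
    }
    return '<iframe width="425" height="349" src="http://www.youtube.com/embed/' . $id
        . '" frameborder="0" allowfullscreen></iframe>';
}
```

The function returns null for anything that is not a YouTube watch link, so the caller can fall back to the image or plain-link handling.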
Second, you call image functions several times for the same URL (getimagesize once and exif_imagetype up to three times), and each call has to download the image from the other server again. You should request the server only once: download the image (or whatever the response is) to a temporary file, and then pass its path instead of the URL to the image functions.
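The download-once approach might look roughly like this. The helper name inspect_remote_image is made up, the sketch assumes allow_url_fopen is enabled so that copy() can read a URL, and it uses getimagesize alone because that call already returns both the MIME type and the IMAGETYPE_* constant.

```php
<?php
// Hypothetical helper: fetches a URL once into a temporary file and inspects the local copy.
function inspect_remote_image(string $url): ?array
{
    $tmp = tempnam(sys_get_temp_dir(), 'img');
    if ($tmp === false) {
        return null;
    }
    try {
        // Single download; every later check reads the local file.
        if (!@copy($url, $tmp)) {
            return null;
        }
        $info = @getimagesize($tmp);
        if ($info === false) {
            return null; // not an image that getimagesize recognizes
        }
        return [
            'width'  => $info[0],
            'height' => $info[1],
            'type'   => $info[2],      // one of the IMAGETYPE_* constants
            'mime'   => $info['mime'],
        ];
    } finally {
        @unlink($tmp); // always remove the temporary copy
    }
}
```

A caller can then compare the returned 'type' with IMAGETYPE_GIF, IMAGETYPE_JPEG or IMAGETYPE_PNG without triggering another download.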