Regex for url formatting (www.domain.tld to anchors)
I'm currently developing a little browser-based Twitter widget.
Right now I'm stuck on getting the URLs to work. I'm kind of a newbie when it comes to regex (I know how to get parts of a string, but this one is a tough one).
So, I need a regex that would search/replace
www.domain.tld -> <a href="http://www.domain.tld">http://www.domain.tld</a>
With/without http://, preferably.
Any advice is welcome. Thanks.
This is how far I've got:
www\.(?:\S*)\.(?:\S{2,3})
It checks for www. at the beginning, then any non-whitespace characters, then a top-level domain (two or three characters).
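For illustration, here is a minimal sketch of the search/replace described above. Note that `\S` in the pattern above accepts any non-whitespace character, so the pattern neither restricts the TLD to letters nor covers the `http://` case. The helper name `linkify` and its pattern are assumptions made for this example, not something from the thread, and the sketch assumes plain-text input (text that already contains anchors would be linked a second time).

```php
<?php
// Hypothetical helper: wraps bare "www." hosts and http(s) URLs in anchors.
function linkify(string $text): string
{
    // Either an explicit scheme or a leading "www.", followed by characters
    // that are not whitespace, angle brackets, or quotes.
    $pattern = '~\b(?:https?://|www\.)[^\s<>"\']+~i';

    return preg_replace_callback($pattern, function (array $m): string {
        // Trailing punctuation is usually sentence text, not part of the URL.
        $url  = rtrim($m[0], '.,;:!?)');
        $tail = substr($m[0], strlen($url));
        // Prepend http:// only when the match has no scheme of its own.
        $href = preg_match('~^https?://~i', $url) ? $url : 'http://' . $url;
        $safe = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
        return '<a href="' . $safe . '">' . $safe . '</a>' . $tail;
    }, $text);
}

echo linkify('Visit www.domain.tld today.');
```

Using a callback rather than a plain replacement string keeps the "add http:// only when missing" decision in ordinary PHP instead of in the pattern.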
I'm in a never-ending war against regexes; I don't like them. So I'd do it like this instead:
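function get_domain_from_anchor($anchor, $delimiter = '"') {
    // Take everything from the first delimiter up to the closing delimiter + '>',
    // then drop the first 8 characters: the delimiter plus 'http://'.
    return substr(strstr(strstr($anchor, $delimiter), $delimiter.'>', true), 8);
}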
echo get_domain_from_anchor('<a href="http://www.domain.net">http://www.domain.net</a>');
// OUTPUTS: www.domain.net
Much better :D
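The function above relies on the href starting with exactly `http://` (hence the hard-coded 8). As a hedged sketch of the same regex-free idea without that assumption, the variant below hands the href to PHP's built-in `parse_url()`. The name `get_host_from_anchor` is hypothetical and was chosen for this example.

```php
<?php
// Hypothetical variant of get_domain_from_anchor() that does not assume "http://".
function get_host_from_anchor(string $anchor, string $delimiter = '"'): ?string
{
    // Everything after the first delimiter, e.g. https://www.domain.net">...
    $rest = substr((string) strstr($anchor, $delimiter), 1);
    // Cut at the next delimiter to isolate the href value.
    $href = strstr((string) $rest, $delimiter, true);
    if ($href === false) {
        return null;
    }
    // parse_url() returns the host component, or null/false when there is none.
    $host = parse_url($href, PHP_URL_HOST);
    return is_string($host) ? $host : null;
}

echo get_host_from_anchor('<a href="https://www.domain.net/path">link</a>');
```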
I believe this is exactly what you're looking for: PHP validation/regex for URL
Some more information regarding extraction of URLs: Extract URLs from text in PHP
Try twitter-text-php. It is ported to PHP from the official Twitter code.
From the README file:
$autolinker = new Twitter_Autolink();
$html = $autolinker->autolink("Tweet mentioning @mikenz and referring to his list @mikeNZ/sports and website http://mikenz.geek.nz");
echo $html;