
How do I use preg_match_all to match strings that start with "http" and end with (") or (') or whitespace (tabs, spaces, line breaks)?

How do I write a regex for preg_match_all that matches text starting with "http" (without the quotes) and ending at (") or (') or whitespace (tab, space, line break)?

I want preg_match_all to return only the parts that start with "http". The input looks like this:

Wupload

http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar

http://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rar

http://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rar

Fileserve

http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar

http://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rar

http://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rar

Uploaded.To

http://ul.to/AAAA/NNIW-LiBRARY.part1.rar

http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar

http://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar

The results must look like this:

http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar

http://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rar

http://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rar

http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar

http://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rar

http://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rar

http://ul.to/AAAA/NNIW-LiBRARY.part1.rar

http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar

http://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar


I suggest you use parse_url to get the parts of each URL. Take a look at its documentation on php.net.

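EDIT:

$file = file_get_contents('YOUR FILE NAME'); // placeholder: replace with the path to your file
// Split on any line-break style (\r\n, \n, or \r).
$lines = preg_split('/\R/', $file);
foreach ($lines as $line) {
    $urlParts = parse_url(trim($line));
    // parse_url() returns false for seriously malformed input,
    // and there is no 'scheme' key for plain text lines such as "Wupload".
    if (is_array($urlParts) && isset($urlParts['scheme']) && $urlParts['scheme'] == 'http') {
        // Do anything ...
    }
}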

CHANGE:

OK, I don't know what your code looks like. If you want to scrape HTML to find links, I suggest the following; it returns the href values of the a tags:

preg_match_all ( "/<[ ]{0,}a[ \n\r][^<>]{0,}(?<= |\n|\r)(?:href)[ \n\r]{0,}=[ \n\r]{0,}[\"|']{0,1}([^\"'>< ]{0,})[^<>]{0,}>((?:(?!<[ \n\r]*\/a[ \n\r]*>).)*)<[ \n\r]*\/a[ \n\r]*>/ is", $source, $regs );

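$tmp_array = array ();
for ( $x = 0; $x < count ( $regs [ 1 ] ); $x ++ ) {
    // Append each href; assigning to one fixed key would keep only the last link.
    $tmp_array [] = array ( "link_raw" => trim ( $regs [ 1 ] [ $x ] ) );
}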

Then use parse_url to check those.
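
One way that check might look is sketched below. The helper name filterHttpLinks and the sample list are assumptions made for illustration; the only real API used is parse_url with the PHP_URL_SCHEME component flag.

```php
<?php
// Hypothetical helper: keep only the links whose scheme is http or https.
function filterHttpLinks(array $links)
{
    $kept = array();
    foreach ($links as $link) {
        $link = trim($link);
        // Returns the scheme as a string, null if there is none,
        // or false if the URL is seriously malformed.
        $scheme = parse_url($link, PHP_URL_SCHEME);
        if (is_string($scheme) && in_array(strtolower($scheme), array('http', 'https'), true)) {
            $kept[] = $link;
        }
    }
    return $kept;
}

// Made-up sample: two web links, one mailto link, one relative path, one plain title.
$links = filterHttpLinks(array(
    'http://ul.to/AAAA/NNIW-LiBRARY.part1.rar',
    ' https://example.com/page ',
    'mailto:someone@example.com',
    '/relative/path',
    'Wupload',
));
print_r($links);
```

Checking the scheme with is_string first avoids treating the null and false return values as a match.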


Do you mean you would like to remove the "Wupload", "Fileserve" and "Uploaded.To" titles and capture just the URLs in an array? If so, try the following:

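// \S+ stops at the first whitespace character, so the line break is not part of the match
// and the last line is found even if the text does not end with a newline.
preg_match_all('!^http://\S+!m', $string, $matches);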
echo "<pre>" . print_r($matches, 1) . "</pre>";
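
The rule stated in the question (start at "http", stop at a double quote, a single quote, or whitespace) can also be written directly as a negated character class. This is a minimal sketch; the sample $string is invented for illustration and mixes a bare URL, a double-quoted one, and a single-quoted one followed by a tab.

```php
<?php
// Made-up input: URLs terminated by a line break, a double quote, and a single quote.
$string = "Wupload\nhttp://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar\n"
        . "<a href=\"http://ul.to/AAAA/NNIW-LiBRARY.part1.rar\">part 1</a>\n"
        . "mirror: 'https://ul.to/BBBBB/NNIW-LiBRARY.part2.rar'\tdone";

// "http", then any run of characters that are neither whitespace nor a quote.
preg_match_all('~http[^\s"\']*~', $string, $matches);

$urls = $matches[0];
print_r($urls);
```

Because the pattern only requires the literal "http", it would also pick up a word such as "httpd"; starting the pattern with https?:// is a stricter variant if that matters for your input.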


This should do what you need:

<?php
$matches = array();
preg_match_all('@https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?@', $string, $matches);
foreach ($matches[0] as $match) {
    // Do your processing here.
}
?>
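
As one possible shape for the processing step, the sketch below groups the matched URLs by host. The sample $string and the $byHost array are illustrative assumptions, not something the answer prescribes.

```php
<?php
// Sample input modelled on the list in the question.
$string = "Wupload\n"
        . "http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar\n"
        . "Uploaded.To\n"
        . "http://ul.to/AAAA/NNIW-LiBRARY.part1.rar\n"
        . "http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar\n";

$matches = array();
preg_match_all('@https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?@', $string, $matches);

$byHost = array(); // hypothetical result: host => list of URLs
foreach ($matches[0] as $match) {
    $host = parse_url($match, PHP_URL_HOST);
    if (!is_string($host)) {
        continue; // skip anything parse_url cannot split into a host
    }
    $byHost[$host][] = $match;
}
print_r($byHost);
```

The title lines ("Wupload", "Uploaded.To") are skipped because the pattern requires the http:// or https:// prefix.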