How do I use preg_match_all to match text that starts with "http" and ends with (") or (') or whitespace (tab, space, line break)?
How do I write a regex for preg_match_all that matches text starting with "http" (without the quotes) and ending at (") or (') or whitespace (tab, space, line break)?
I want preg_match_all to capture every part that starts with "http" in text like this:
Wuploadhttp://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rarhttp://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rarhttp://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rarFileservehttp://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rarhttp://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rarhttp://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rarUploaded.Tohttp://ul.to/AAAA/NNIW-LiBRARY.part1.rarhttp://ul.to/BBBBB/NNIW-LiBRARY.part2.rarhttp://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar
The results must look like this:
http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar
http://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rar
http://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rar
http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar
http://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rar
http://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rar
http://ul.to/AAAA/NNIW-LiBRARY.part1.rar
http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar
http://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar
I suggest you use parse_url to fetch the parts of each URL. Take a look at php.net.
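EDIT:
// Replace 'YOUR_FILE_NAME' with the path to your file.
$file = file_get_contents('YOUR_FILE_NAME');
// Split on any line ending (\r\n, \n or \r).
$lines = preg_split('/\R/', $file);
foreach ($lines as $line) {
    $urlParts = parse_url(trim($line));
    // parse_url() returns false for seriously malformed URLs and omits 'scheme' when there is none.
    if (is_array($urlParts) && isset($urlParts['scheme']) && $urlParts['scheme'] == 'http') {
        // Do anything ...
    }
}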
CHANGE:
OK, I don't know what your code looks like. If you want to scrape HTML to find links, I suggest this; it returns the href values of the a tags:
preg_match_all ( "/<[ ]{0,}a[ \n\r][^<>]{0,}(?<= |\n|\r)(?:href)[ \n\r]{0,}=[ \n\r]{0,}[\"|']{0,1}([^\"'>< ]{0,})[^<>]{0,}>((?:(?!<[ \n\r]*\/a[ \n\r]*>).)*)<[ \n\r]*\/a[ \n\r]*>/ is", $source, $regs );
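$tmp_array = array();
for ( $x = 0; $x < count ( $regs [ 1 ] ); $x ++ ) {
    // Store each href in its own entry instead of overwriting a single key.
    $tmp_array [ $x ] [ "link_raw" ] = trim ( $regs [ 1 ] [ $x ] );
}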
Then use parse_url to check those.
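A minimal sketch of that check, assuming the extracted href values have already been collected in an array. The names $links and $httpLinks and the sample values are made up for illustration, and accepting https as well as http is an assumption on my part:

```php
<?php
// Hypothetical input: href values already extracted from the page.
$links = array(
    'http://ul.to/AAAA/NNIW-LiBRARY.part1.rar',
    'ftp://example.com/file.rar',
    '/relative/path.html',
    'https://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar',
);

$httpLinks = array();
foreach ($links as $link) {
    // With PHP_URL_SCHEME, parse_url() returns only the scheme as a string,
    // null when the URL has no scheme, or false when it cannot be parsed.
    $scheme = parse_url($link, PHP_URL_SCHEME);
    if (is_string($scheme) && in_array(strtolower($scheme), array('http', 'https'), true)) {
        $httpLinks[] = $link;
    }
}

print_r($httpLinks);
```

Relative links and other schemes are dropped, so only absolute web URLs remain.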
Do you mean you would like to remove the "Wupload", "Fileserve" and "Uploaded.To" titles and capture just the URLs in an array? If so, try the following:
preg_match_all('!^http://.*\n!m', $string, $matches);
echo "<pre>" . print_r($matches, 1) . "</pre>";
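If the URLs are not always on lines of their own, the rule from the question (start at "http", stop at a quote or any whitespace) can probably be written directly as a negated character class. This is only a sketch; the sample $string is made up for illustration and is not the asker's real input:

```php
<?php
// Hypothetical sample text: URLs delimited by a line break, double quotes and single quotes.
$string = "Wupload\nhttp://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar\n"
        . "<a href=\"http://ul.to/AAAA/NNIW-LiBRARY.part1.rar\">part 1</a>\t"
        . "<a href='http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar'>part 2</a>";

// Match "http", then every following character that is not
// whitespace (\s covers space, tab and line breaks), " or '.
preg_match_all('~http[^\s"\']+~', $string, $matches);

// $matches[0] holds the full matches, one URL per element.
print_r($matches[0]);
```

Note that this relies on a quote or whitespace character actually following each URL. If the URLs and titles are run together with nothing between them, the pattern has no way to tell where one URL ends and the next begins.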
This should do what you need:
<?php
$matches = array();
preg_match_all('@https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?@', $string, $matches);
foreach ($matches[0] as $match) {
// Do your processing here.
}
?>