Scrape page on download site to extract specific URLs
On a download site, I want to scrape all the URLs for the mirror sites. I am using PHP.
For example, on this page:
http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html
I want to extract the following URLs:
http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=1
http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=2
Try matching the page source against this regular expression:
(http:\/\/drivers\.softpedia\.com\/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+)
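As a rough sketch of how that pattern could be applied in PHP with preg_match_all: the $html string below is made-up sample markup standing in for the downloaded page (in practice you would fetch it with something like file_get_contents), and the entity-decoding step is an assumption about how the links may be encoded in the real source.

```php
<?php
// Hypothetical sample markup; the real page's HTML may look different.
$html = '<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&t=0&i=1">Mirror 1</a>'
      . '<a href="http://drivers.softpedia.com/dyn-postdownload.php?p=99091&amp;t=0&amp;i=2">Mirror 2</a>';

// Ampersands in href attributes are often written as &amp;, so decode first.
$decoded = html_entity_decode($html);

// Same pattern as above, using ~ as the delimiter so slashes need no escaping.
$pattern = '~http://drivers\.softpedia\.com/dyn-postdownload\.php\?p=\d+&t=\d+&i=\d+~';

preg_match_all($pattern, $decoded, $matches);

// $matches[0] holds every full match; drop duplicates and reindex.
$mirrors = array_values(array_unique($matches[0]));

print_r($mirrors);
```

Using a delimiter other than / keeps the pattern readable, and array_unique guards against the same mirror link appearing more than once on the page.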
It is unclear where the "t" and "i" parameters come from; the source URL only contains the id (p). The pattern below retrieves that last group of digits.
%(\d+)\.html$%
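For illustration, here is one way that pattern might be used with preg_match to pull the id out of the page URL. The helper name extractDownloadId is made up for this sketch, and it assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: returns the trailing numeric id of a URL ending
// in "<digits>.html", or null if the URL does not have that shape.
function extractDownloadId(string $url): ?string
{
    if (preg_match('%(\d+)\.html$%', $url, $m)) {
        return $m[1]; // first capture group: the digits before ".html"
    }
    return null;
}

$pageUrl = 'http://drivers.softpedia.com/progDownload/Gigabyte-GA-P55A-UD3-rev-10-Intel-SATA-RAID-Preinstall-Driver-9501037-Download-99091.html';

$id = extractDownloadId($pageUrl);
echo $id, "\n";
```

This only recovers the p value; the t and i values would still have to come from the page itself.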