wget return downloaded filename
I'm using wget in a php script and need to get the name of the file downloaded.
For example, if I try
<?php
system('/usr/bin/wget -q --directory-prefix="./downloads/" http://www.google.com/');
?>
I will get a file called index.html in the downloads directory.
EDIT: The page will not always be google though, the target may be an image or stylesheet, so I need to find out the name of the file that was downloaded.
I'd like to have something like this:
<?php
//Does not work:
$filename = system('/usr/bin/wget -q --directory-prefix="./downloads/" http://www.google.com/');
//$filename should contain "index.html"
?>
Maybe that's some kind of cheating, but why not:
- decide yourself the name of the file that wget should create
- indicate to wget that the download should be made to that file
- when the download is finished, use that file, as you already know the name.
Check out the -O option of wget ;-)
For example, running this from the command-line :
wget 'http://www.google.com/' -O my-output-file.html
Will create a file called my-output-file.html.
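From PHP, the same idea could look roughly like the sketch below. The helper names buildWgetCommand and downloadAs are made up for illustration, and the wget path is simply the one used in the question.

```php
<?php
// Hypothetical helper (not part of wget or PHP): build a wget command
// that saves the download under a name chosen by the caller.
function buildWgetCommand(string $url, string $target): string
{
    // escapeshellarg() quotes each value so the shell treats it as one argument.
    return '/usr/bin/wget -q -O ' . escapeshellarg($target) . ' ' . escapeshellarg($url);
}

// Hypothetical wrapper: returns the file name on success, null on failure.
function downloadAs(string $url, string $target): ?string
{
    exec(buildWgetCommand($url, $target), $output, $status);
    // wget exits with status 0 when the download succeeded.
    return $status === 0 ? $target : null;
}
```

Calling downloadAs('http://www.google.com/', './downloads/my-output-file.html') would then give back the name you passed in, so nothing has to be parsed from wget's output.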
If your requirement is as simple as fetching google.com, you can do it entirely within PHP:
$data = file_get_contents('http://www.google.com/');
// file_put_contents() takes the target path first, then the data
file_put_contents("./downloads/output.html", $data);
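Since the target may also be an image or a stylesheet, the local name could be derived from the URL instead of being hard-coded. This is only a sketch: filenameFromUrl and saveUrl are hypothetical helpers, the index.html fallback just mirrors what the question observed, and redirects or server-supplied names are not taken into account.

```php
<?php
// Hypothetical helper: guess a local file name from the path part of a URL.
function filenameFromUrl(string $url, string $default = 'index.html'): string
{
    $path = parse_url($url, PHP_URL_PATH);
    // parse_url() gives null when the URL has no path and false when it is malformed.
    if (!is_string($path) || $path === '' || substr($path, -1) === '/') {
        return $default;
    }
    // basename() keeps only the last path segment; the query string is not part of the path.
    return basename($path);
}

// Hypothetical wrapper: fetch $url into $dir and return the name used, or null on failure.
function saveUrl(string $url, string $dir): ?string
{
    $data = file_get_contents($url);
    if ($data === false) {
        return null;
    }
    $name = filenameFromUrl($url);
    return file_put_contents("$dir/$name", $data) === false ? null : $name;
}
```

With this, saveUrl('http://www.google.com/', './downloads') would be expected to write index.html, while a URL ending in /images/logo.png would be saved as logo.png.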
On Linux-like systems you can do:
system('/usr/bin/wget -q --directory-prefix="./downloads/" http://www.google.com/');
// ls -tr lists the newest file last; exec() returns that last line without echoing the listing
$filename = exec('ls -tr ./downloads'); // $filename is now index.html
This works only if no other process is creating files in the ./downloads directory.
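A variation that does not depend on modification times is to list the directory before and after the download and compare the two listings. The sketch below assumes the same wget path and options as above; newEntries and downloadAndDetect are hypothetical names, and the same caveat about other processes writing to the directory applies.

```php
<?php
// Hypothetical helper: names present in $after that were not in $before.
function newEntries(array $before, array $after): array
{
    // array_values() reindexes the result so it starts at 0.
    return array_values(array_diff($after, $before));
}

// Hypothetical wrapper: run wget and report which files appeared in $dir.
function downloadAndDetect(string $url, string $dir): array
{
    // scandir() returns false on failure; treat that as an empty listing.
    $before = scandir($dir) ?: [];
    exec('/usr/bin/wget -q --directory-prefix=' . escapeshellarg($dir) . ' ' . escapeshellarg($url));
    $after = scandir($dir) ?: [];
    return newEntries($before, $after);
}
```

If index.html already exists, wget by default saves the new copy under a numbered name such as index.html.1, and that new name is what shows up in the comparison.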
I ended up using PHP to find the most recently updated file in the directory with the following code:
<?php
system('/usr/bin/wget -q --directory-prefix="./downloads/" http://www.google.com/');
$dir = "./downloads";
$newstamp = 0;
$newname = "";
$dc = opendir($dir);
// Compare against false explicitly so a file named "0" does not end the loop
while (($fn = readdir($dc)) !== false) {
    // Skip the current and parent directory entries (ereg() was removed in PHP 7)
    if ($fn === '.' || $fn === '..') continue;
    $timedat = filemtime("$dir/$fn");
    if ($timedat > $newstamp) {
        $newstamp = $timedat;
        $newname = $fn;
    }
}
closedir($dc);
// $newname contains the name of the most recently updated file
// $newstamp contains the time of the update to $newname
?>