How to wget a file when the filename isn't known?
I am trying to automate the download of a file using wget, calling a PHP script from cron. The filename always consists of a fixed name plus a date, but the date changes depending on when the file is uploaded. The trouble is that there is no certainty about when the file is updated, so the final name can never really be known until the directory is checked.
An example filename is file20100818.tbz
I tried using wildcards with wget, both * and %, but neither worked.
Thanks in advance,
Greg
Assuming the file type is constant, then from the wget man page:
You want to download all the GIFs from a directory on an HTTP server. You tried wget http://www.server.com/dir/*.gif, but that didn't work because HTTP retrieval does not support globbing. In that case, use:
wget -r -l1 --no-parent -A.gif http://www.server.com/dir/
So, you want to use the -A flag, something like:
wget -r -l1 --no-parent -A.tbz http://www.mysite.com/path/to/files/
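If other .tbz files live in the same directory, the accept list can be narrowed, since wget treats an -A entry containing a wildcard as a filename pattern rather than a suffix. The sketch below is only one way to wire this up: the function names, the file*.tbz pattern, and the idea of picking the newest file afterwards are assumptions for illustration, not part of the answer above.

```shell
#!/bin/sh
# Sketch only: the "file*.tbz" pattern follows the naming in the question.

# Fetch every file in the remote directory ($1) whose name matches
# file*.tbz into the local directory $2 (one level deep, no parent dirs,
# no local directory hierarchy).
fetch_matching() {
    url=$1
    dest=$2
    wget -q -r -l1 -nd --no-parent -A 'file*.tbz' -P "$dest" "$url"
}

# Print the newest file*.tbz in a directory. YYYYMMDD dates sort
# lexicographically in chronological order, so the last name wins.
# Prints nothing if no file matches.
newest_tbz() {
    ls "$1"/file*.tbz 2>/dev/null | sort | tail -n 1
}
```

A cron job could then call fetch_matching http://www.mysite.com/path/to/files/ /some/local/dir followed by newest_tbz /some/local/dir, where both the URL and the local directory are placeholders.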
For the sake of clarity: this thread shows up in Google when searching for "wget and wildcards", the answers above don't offer a sensible solution, and there doesn't seem to be anything else on SO answering this, so:
According to the wget manual, you can use wildcards when using FTP with the option -g on (--glob=on); however, wget will return an error unless you also use all of the -r -np -nd options. Thanks to Wiseman20@ubuntuforums for showing us the way.
Sample code:
wget -r -np -nd --glob=on ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.*.tar.gz
You can loop over the candidate dates like this:
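<?php
// Try one candidate filename per day, starting from today.
for ($i = 0; $i < 30; $i++) {
    $filename = "file" . date("Ymd", time() + 86400 * $i) . ".tbz";
    // Try the download; wget exits with status 0 on success. The base URL is a placeholder.
    exec("wget -q " . escapeshellarg("http://www.mysite.com/path/to/files/" . $filename), $output, $status);
    if ($status === 0) {
        break; // downloaded, stop trying further dates
    }
}
?>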
You can increase the number of tries by raising the limit in the for loop.
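The same idea can be sketched directly in shell, which avoids the PHP wrapper when the job runs from cron anyway. This is a rough sketch under several assumptions: it needs GNU date (for the -d option), the function names are made up for illustration, and it counts backwards from the start date on the guess that the newest upload is dated today or earlier, which is the opposite direction from the PHP loop.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (the -d option) and the file<YYYYMMDD>.tbz
# naming from the question.

# Print one candidate filename per line for the $1 days ending at $2
# (YYYY-MM-DD, default: today), newest first.
candidate_names() {
    days=$1
    start=${2:-$(date +%Y-%m-%d)}
    i=0
    while [ "$i" -lt "$days" ]; do
        echo "file$(date -u -d "$start $i day ago" +%Y%m%d).tbz"
        i=$((i + 1))
    done
}

# Try each candidate under base URL $1 in turn, for $2 days, and stop at
# the first one wget can fetch. Prints the name that was downloaded.
fetch_first_available() {
    base_url=$1
    days=$2
    for name in $(candidate_names "$days"); do
        if wget -q "$base_url/$name"; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

For example, fetch_first_available http://www.mysite.com/path/to/files 30 would probe the last 30 days; the URL is a placeholder. Each miss costs one HTTP request, so this suits short windows better than long ones.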