Wget with input-file and output-document

I have a list of URLs which I would like to feed into wget using --input-file.

However, I can't work out how to control the --output-document value at the same time, which is simple if you issue the commands one by one. I would like to save each document as the MD5 of its URL.

 cat url-list.txt | xargs -P 4 wget

xargs is there because I also want to use its max-procs feature (-P) for parallel downloads.


Don't use cat. You can have xargs read from a file. From the man page:

       --arg-file=file
       -a file
              Read items from file instead of standard input.  If you use this
              option, stdin remains unchanged when commands are  run.   Other‐
              wise, stdin is redirected from /dev/null.
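
To connect this to the question, here is a minimal sketch that combines -a with -P 4 and names each download after the MD5 of its URL. It assumes bash and GNU xargs, md5sum, and cut; the function names url_md5 and fetch_all are made up for illustration and are not part of wget or xargs.

```shell
# Print the MD5 hex digest of the string given as $1.
# printf avoids hashing a trailing newline; cut drops the "  -" that
# md5sum appends when it reads from stdin.
url_md5() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Download every URL listed in the file $1 (one URL per line), running up
# to 4 wget processes at once, and save each document under the MD5 of
# its URL. Assumes bash, because the helper is exported with export -f.
fetch_all() {
    export -f url_md5
    xargs -a "$1" -d '\n' -n 1 -P 4 \
        bash -c 'wget -q -O "$(url_md5 "$1")" "$1"' _
}
```

Calling fetch_all url-list.txt would then start the downloads. The -d '\n' option makes xargs treat each line as one item, so quotes or backslashes inside a URL are passed through unchanged.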


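How about using a loop?

while read -r line
do
   # md5sum prints "<hash>  -" for stdin, so keep only the hash;
   # printf avoids hashing the trailing newline that echo would add
   md5=$(printf '%s' "$line" | md5sum | cut -d' ' -f1)
   wget ... "$line" ... --output-document "$md5" ......
done < url-list.txt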


In your question you use -P 4 which suggests you want your solution to run in parallel. GNU Parallel http://www.gnu.org/software/parallel/ may help you:

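cat url-list.txt | parallel 'wget {} --output-document "$(printf %s {} | md5sum | cut -d" " -f1)"'

The cut step is needed because md5sum prints "<hash>  -" when reading from stdin, and only the hash should end up in the file name.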


You can do that like this (md5 is the BSD/macOS command; on Linux use md5sum and strip the trailing "  -"):

while read -r url; do wget "$url" -O "$(echo "$url" | md5)"; done < url-list.txt
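
Since the hashing command differs between systems, a small wrapper can hide the difference. This is only a sketch: md5_hex is a hypothetical helper name, and it assumes either GNU md5sum or the BSD/macOS md5 command (with its -q option) is installed.

```shell
# Print the MD5 hex digest of the string given as $1, using whichever
# hashing tool is available on this system.
md5_hex() {
    if command -v md5sum >/dev/null 2>&1; then
        # GNU coreutils: output is "<hash>  -", keep the first field only
        printf '%s' "$1" | md5sum | cut -d' ' -f1
    else
        # BSD/macOS: -q prints the digest alone
        printf '%s' "$1" | md5 -q
    fi
}
```

Inside the loop the file name then becomes "$(md5_hex "$url")". Note that this hashes the URL without a trailing newline, so the names differ from those produced by echo "$url" | md5.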

good luck
