
How can wget save only certain file types linked to from pages linked to by the target page?

How can wget save only certain file types linked to from pages linked to by the target page, regardless of the domain in which the certain files are?

Trying to speed up a task I have to do often.

I've been rooting through the wget docs and googling, but nothing seems to work. I keep getting either just the target page or the subpages without the files (even using -H), so I'm obviously doing something wrong.

So, essentially, example.com/index1/ contains links to example.com/subpage1/ and example.com/subpage2/, while the subpages contain links to example2.com/file.ext and example2.com/file2.ext, etc. However, example.com/index1.html may link to example.com/index2/ which has links to more subpages I don't want.

Can wget even do this, and if not then what do you suggest I use? Thanks.


The following command worked for me.

wget -r --accept "*.ext" --level 2 "example.com/index1/"

The download has to be recursive, so -r must be added; without it, --level has no effect.
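
Since the question says the files live on a different host (example2.com), the command above may also need -H to follow links off example.com. Here is a rough sketch of how that could look, not a tested recipe: the hosts, the start URL and the "ext" suffix are the placeholders from the question, and the function name and the WGET override are made up for illustration.

```shell
#!/bin/sh
# Sketch only: hosts, URL and "*.ext" are the question's placeholders.
START_URL="example.com/index1/"
PAGE_HOST="example.com"    # host of the index page and subpages
FILE_HOST="example2.com"   # host the wanted files are served from

# fetch_linked_files is a hypothetical helper name.
fetch_linked_files() {
    # -r         recurse through links
    # --level 2  start page -> subpages -> files
    # -H         allow following links to other hosts
    # --domains  restrict host spanning to the hosts listed
    # --accept   keep only files whose names match the pattern
    # ${WGET:-wget} lets a dry run substitute "echo" for wget (assumed convention)
    "${WGET:-wget}" -r --level 2 -H \
        --domains "$PAGE_HOST,$FILE_HOST" \
        --accept "*.ext" \
        "$START_URL"
}
```

Run fetch_linked_files to start the download, or WGET=echo fetch_linked_files to print the arguments without touching the network. wget still has to fetch the HTML pages to find the links; with --accept set it deletes them afterwards because they don't match the pattern. If example.com/index2/ should be skipped entirely, -X /index2 (--exclude-directories) might help, though I haven't checked it against this layout.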


Something like this should work:

wget --accept "*.ext" --level 2 "example.com/index1/"