Snatch a daily comic and store it locally
I want to copy the daily comic from www.explosm.net/comics and store it in a local folder.
The comics have unique names and are stored at several locations depending on the creator.
Like this:
- www.explosm.net/db/files/Comics/Rob/comic1.png
- www.explosm.net/db/files/Comics/Dave/comic2.png
However, every daily comic is available through the same URL, www.explosm.net/comics, which redirects you to the newest comic available. Not sure if this is of any use, though.
I've experimented a little with the wget command together with cron to achieve this, but my lack of knowledge didn't yield any satisfactory results.
Thanks in advance.
You might want to look into cURL. What you want is a script which invokes cURL to obtain the page source served when you request www.explosm.net/comics. Then you'd parse the returned data looking for the img tag which displays the comic.
Once you have the src attribute of that img tag, you can make another request using cURL to download the image itself and save the returned data to a local file.
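For example, the two requests themselves are just two curl invocations (the image URL below is one of the example file names from the question, not a real comic):

```bash
# First request: get the page source. -s silences progress output,
# -L follows the redirect from /comics to the newest comic's page.
curl -sL http://www.explosm.net/comics > page.html

# Second request: once you have extracted the image URL, save it locally.
curl -s http://www.explosm.net/db/files/Comics/Rob/comic1.png -o comic1.png
```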
It appears that the source of the actual comic image, the one you're looking for, starts with http://www.explosm.net/db/files/Comics, so you can use a regular expression such as the following to determine the URL of the image you want to download.
src\=\"(http:\/\/www\.explosm\.net\/db\/files\/Comics\/[^"]*)\"
The URL will be the first capture group in the match.
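Putting it together, here is a minimal sketch you could run from cron, assuming curl, grep, and sed are available; the script name and the $HOME/comics destination folder are placeholders:

```bash
#!/bin/sh
# Fetch the page source; -L follows the redirect to the newest comic.
page=$(curl -sL http://www.explosm.net/comics)

# Extract the image URL using the regular expression above.
# grep -o prints only the matching part; sed strips the src="..." wrapper.
url=$(printf '%s' "$page" \
  | grep -o 'src="http://www\.explosm\.net/db/files/Comics/[^"]*"' \
  | head -n 1 \
  | sed 's/^src="//; s/"$//')

# Download the image, keeping the comic's own unique file name.
mkdir -p "$HOME/comics"
[ -n "$url" ] && curl -s "$url" -o "$HOME/comics/$(basename "$url")"
```

You could then have cron run this once a day, e.g. `0 8 * * * /path/to/get_comic.sh` (get_comic.sh being whatever you name the script).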