How to quickly check a large number of URLs to see if they're still valid (or 404/otherwise missing)
The URLs are already conveniently in a text file, one line per URL, so it's probably a trivial curl/wget/LWP one-liner. Anybody care to share one?
With LWP you can do this (redirecting the output to a file if you wish):
linux-t77m$ cat urls
http://google.com
http://stackoverflow.com
http://yahoo.com
linux-t77m$ cat urls | while read i ;do echo -n $i" "; lwp-request $i -sd; done
http://google.com 200 OK
http://stackoverflow.com 200 OK
http://yahoo.com 200 OK
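If you'd rather use curl instead of LWP, a roughly equivalent loop is sketched below. This is only a sketch under a few assumptions: curl is installed, the same urls file is used, and a HEAD request plus the %{http_code} write-out variable is enough to report the status.

# Print each URL followed by the HTTP status code from a HEAD request.
while read -r url; do
  printf '%s %s\n' "$url" "$(curl -s -o /dev/null -w '%{http_code}' -I "$url")"
done < urls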
The quickest way (in the sense of getting all the results back sooner) would of course be to launch the requests in parallel; a sketch of that follows.
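A minimal sketch of the parallel approach using xargs -P and curl (assumptions: a GNU or BSD xargs that supports -P, curl installed, the same urls file; the concurrency level of 10 is arbitrary):

# Check up to 10 URLs at a time; each output line is "URL STATUS".
xargs -n 1 -P 10 sh -c 'printf "%s %s\n" "$1" "$(curl -s -o /dev/null -w "%{http_code}" -I "$1")"' _ < urls

Note that with parallel runs the output lines can come back in any order, so sort them or join against the original list if order matters.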