What can I use to check HTML links in a large project on Linux?
I have a directory with > 1000 .html files and would like to check all of them for bad links, preferably from the console. Can you recommend a tool for this task?
You can use wget, e.g.:

wget -r --spider -o output.log http://somedomain.com

At the bottom of output.log, wget will indicate whether it found broken links. You can parse that using awk/grep.
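As a rough sketch of that parsing step (the exact wording of wget's log summary varies between versions, so treat these patterns as a starting point rather than anything definitive):

# wget's spider run ends with a summary such as "Found N broken links.",
# followed by the offending URLs; print that block
grep -i -A 50 'broken link' output.log

# alternatively, list the log lines for requests that came back 404
grep -n '404 Not Found' output.log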
I'd use checklink (a W3C project).
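If checklink (the command-line front end of the W3C Link Checker, distributed as the W3C::LinkChecker Perl package) is on your PATH, a basic run looks roughly like this; flag names can differ between versions, so confirm with checklink --help:

# check a single page
checklink http://somedomain.com/

# recurse from the start page and report only broken links/redirects
# (the --recursive and --broken options are assumed here; verify locally)
checklink --recursive --broken http://somedomain.com/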
You can extract links from HTML files using the Lynx text browser, and Bash scripting around this should not be difficult.
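A minimal sketch of that approach, assuming lynx and curl are available and the pages live under the current directory (patterns and thresholds are illustrative, not a polished checker):

#!/bin/bash
# Gather every link lynx can see in the .html files, then probe each URL with curl.
find . -name '*.html' -print0 |
  while IFS= read -r -d '' f; do
    # -dump -listonly prints just the list of references found in the page
    lynx -dump -listonly "$f"
  done |
  grep -o 'https\?://[^ ]*' | sort -u |
  while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    # 000 means curl could not connect at all; >= 400 is an HTTP error
    if [ "$code" -lt 200 ] || [ "$code" -ge 400 ]; then
      echo "BROKEN ($code): $url"
    fi
  done

Note that this only checks absolute http(s) links; relative links inside the files would need to be resolved against the site root first.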
Try the webgrep command-line tools or, if you're comfortable with Perl, the HTML::TagReader module by the same author.