Using Heritrix 1.14
Went through the post and your suggested solution as in -- Which web crawler for extracting and parsing data from about a thousand of web sites
I have installed Heritrix under /root/heritrix-1.14.4
I am stuck at the step export HERITRIX_HOME=/PATH/TO/BUILT/HERITRIX
The command runs silently, but cd $heritrix_home results in
-bash: cd: /root/heritrix-1.14.4/bin/heritrix: Not a directory.
Have googled unsuccessfully.
chmod u+x $heritrix_home/bin/heritrix results in chmod: cannot access `/root/heritrix-1.14.4/bin/heritrix/bin/heritrix': Not a directory
Any guidance or pointers would be appreciated.
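It looks like HERITRIX_HOME is set to /root/heritrix-1.14.4/bin/heritrix, which is the launcher script (a file), whereas it should be set to the install directory /root/heritrix-1.14.4. That is why cd fails with "Not a directory", and why the chmod path expands to /root/heritrix-1.14.4/bin/heritrix/bin/heritrix.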
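As a rough sketch of the fix, the snippet below sets the variable to the install directory and checks it before use. The path is the one from the question. The function name check_heritrix_home is a hypothetical helper introduced only for illustration and is not part of Heritrix. Note also that shell variable names are case-sensitive, so $HERITRIX_HOME and $heritrix_home are two different variables.

```shell
# Hypothetical helper (not part of Heritrix): verify that a candidate value
# for HERITRIX_HOME is the install directory, i.e. a directory that contains
# the bin/heritrix launcher script, rather than the script itself.
check_heritrix_home() {
    candidate=$1
    if [ ! -d "$candidate" ]; then
        echo "not a directory: $candidate"
        return 1
    fi
    if [ ! -f "$candidate/bin/heritrix" ]; then
        echo "no bin/heritrix under: $candidate"
        return 1
    fi
    echo "ok: $candidate"
}

# Path taken from the question; adjust it to your own install location.
HERITRIX_HOME=/root/heritrix-1.14.4

if check_heritrix_home "$HERITRIX_HOME"; then
    export HERITRIX_HOME
    # Make the launcher executable, as in the original install step.
    chmod u+x "$HERITRIX_HOME/bin/heritrix"
fi
```

If the check reports "not a directory", run echo "$HERITRIX_HOME" to see what the variable currently holds, and look for an older export line in ~/.bashrc or ~/.bash_profile that may still be setting it to the script path.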