What is the process to compile Nutch into one Jar file (and run it)?
I'm trying to run the Nutch crawler in a way that I can access all its functionality through one JAR file that contains all its dependencies.
For instance,
java -jar nutch-all-1.2.jar -crawl <other params>
and at a later stage, call it with hadoop.
Currently, doing a
java -jar nutch-1.2.jar
on the JAR file that exists in the nutch directory results in the error,
Failed to load Main-Class manifest attribute from
nutch-1.2.jar
I believe this happens because the manifest of this particular JAR (META-INF/MANIFEST.MF) does not declare a Main-Class, and the JAR does not bundle the other JARs it depends on. What would you recommend as the best way to build Nutch into one JAR for this purpose?
Thanks!
After much looking around, I realized that the simplest way to run Nutch from the command line is to use the nutch.job file instead of the plain JAR, and to launch it through Hadoop. The syntax is:
hadoop jar nutch-1.0.job org.apache.nutch.crawl.Crawl urls -dir crawl -depth 1
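As for the original error: java -jar can only start a JAR whose manifest names a Main-Class, which is why the hadoop jar form above passes the class name (org.apache.nutch.crawl.Crawl) explicitly. If you want to check what a given JAR declares, here is a minimal sketch using the standard java.util.jar API. The class name ManifestCheck and the helper mainClassOf are my own names for illustration, not part of Nutch or Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Nutch: reports the Main-Class of a JAR.
public class ManifestCheck {

    /** Returns the Main-Class declared in the JAR's manifest, or null if none is declared. */
    static String mainClassOf(File jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            Manifest manifest = jarFile.getManifest(); // null when the JAR has no manifest at all
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ManifestCheck <path-to-jar>");
            return;
        }
        String mainClass = mainClassOf(new File(args[0]));
        if (mainClass == null) {
            System.out.println("No Main-Class attribute; 'java -jar' cannot launch this JAR.");
        } else {
            System.out.println("Main-Class: " + mainClass);
        }
    }
}
```

Compile it with javac ManifestCheck.java and run it as java ManifestCheck nutch-1.2.jar. If it reports no Main-Class, you have to name the class yourself on the command line, as in the hadoop jar command above.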