Bash Script Command Issue
When I type the following command into Cygwin:
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
then the binary works fine. When I place the exact same line into my bash script:
#!/bin/bash/
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*
I get an error saying some files don't exist. This may be specific to Nutch which is the program I'm running, but I think it has more to do with how I'm calling the command in the script. Any ideas about what's wrong and how to fix this? (yes I'm using tab completion)
EDIT:
Script:
#!/bin/bash
/home/Dan/apache-nutch-1.2/bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
I run the command:
$ pwd
/home/Dan/apache-nutch-1.2
$ ./nutch.sh
The output I'm getting is:
Indexer: starting at 2010-11-29 15:15:44
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_fetch
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_parse
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_data
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_text
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
at org.apache.nutch.indexer.Indexer.run(Indexer.java:97)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.Indexer.main(Indexer.java:106)
Regards, ~DS
Two things:
- You've got a trailing slash after "bash" in the shebang at the start of the script. Remove it so that the line reads just #!/bin/bash. Also double-check that there is a bash in /bin.
- The script will try to execute nutch from the bin directory in your current folder. So if you're in $HOME, and assuming you've got a path $HOME/bin/nutch, then you'll be okay. But if you change to /tmp, it'll fail, as there's no such path as /tmp/bin/nutch. You're better off giving the full absolute path to nutch in the first place.
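As a rough sketch of the second point, the wrapper below calls nutch by its absolute path and also changes into the install directory first, because the crawl/... arguments are relative paths too and would otherwise be resolved against whatever directory the script was started from. The NUTCH_HOME variable and the run_index function are names made up for this illustration, and the default path is only assumed from the pwd output in the question.

```shell
#!/bin/bash
# Hypothetical variable: where Nutch is installed. The default is an
# assumption based on the pwd output shown in the question.
NUTCH_HOME="${NUTCH_HOME:-/home/Dan/apache-nutch-1.2}"

# run_index is an illustrative helper, not part of Nutch.
# It refuses to run if bin/nutch is missing, then invokes it by absolute
# path from inside the install directory so the relative crawl/ arguments
# and the segments glob resolve the same way regardless of the caller's cwd.
run_index() {
    local home="$1"
    if [ ! -x "$home/bin/nutch" ]; then
        echo "nutch not found or not executable: $home/bin/nutch" >&2
        return 1
    fi
    # The subshell keeps the cd from affecting the rest of the script.
    ( cd "$home" && "$home/bin/nutch" index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/* )
}

run_index "$NUTCH_HOME" || echo "index step failed" >&2
```

With this in place the script should behave the same whether it is started from the Nutch directory, from $HOME, or from /tmp. Setting NUTCH_HOME in the environment before running it overrides the assumed default.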