Stanford parser bash script error - linux bash
Can someone help me check my bash script? I'm trying to feed a directory of .txt files to the Stanford parser (http://nlp.stanford.edu/software/pos-tagger-faq.shtml), but I can't get it to work. I'm working on Ubuntu 10.10.
The loop works and reads the right files with:
#!/bin/bash -x
cd $HOME/path/to
for file in 'dir -d *'
do
# $HOME/chinesesegmenter-2006-05-11/segment.sh ctb $file UTF-8
echo $file
done
but with
#!/bin/bash -x
cd $HOME/yoursing/sentseg_zh
for file in 'dir -d *'
do
# echo $file
$HOME/chinesesegmenter-2006-05-11/segment.sh ctb $file UTF-8
done
I'm getting this error:
alvas@ikoma:~/chinesesegmenter-2006-05-11$ bash segchi.sh
Standard: CTB
File: dir
Encoding: -d
-------------------------------
Exception in thread "main" java.lang.NoClassDefFoundError: edu/stanford/nlp/ie/crf/CRFClassifier
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.ie.crf.CRFClassifier
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: edu.stanford.nlp.ie.crf.CRFClassifier. Program will exit.
The following command works:
~/chinesesegmenter-2006-05-11/segment.sh ctb ~/path/to/input.txt UTF-8
and outputs this:
alvas@ikoma:~/chinesesegmenter-2006-05-11$ ./segment.sh ctb ~/path/to/input.txt UTF-8
Standard: CTB
File: /home/alvas/path/to/input.txt
Encoding: UTF-8
-------------------------------
Loading classifier from data/ctb.gz...done [1.5 sec].
Using ChineseSegmenterFeatureFactory
Reading data using CTBSegDocumentReader
Sequence tagging 7 documents
如果 您 在 新加坡 只 能 前往 一 间 俱乐部 , 祖卡 酒吧 必然 是 您 的 不二 选择 。
作为 或许 是 新加坡 唯一 一 家 国际 知名 的 夜店 , 祖卡 既 是 一 个 公共 机构 , 也 是 狮城 年轻人 选择 进行 成人 礼等 庆祝 的 不二场所 。
As well as the : (colon), which should be a ; or a newline, the 'dir -d *' doesn't do what you think it does. Because it is in single quotes, no command is run: the loop has just one iteration, where file is the literal string dir -d * (it only turns into dir, -d and all your file names when $file is later expanded unquoted). Also, you initially change to a path based on $file but then reuse the variable file in your loop, which is suspect. I'm having to guess somewhat about your intent, but it can be much simpler, e.g.:
#!/bin/bash
cd ~/path/to/whereever
for file in *
do
~/chinesesegmenter-2006-05-11/segment.sh ctb "$file" UTF-8
done
Even if you used the (more correct) version with backticks:
for file in `dir -d *`
... it would still qualify for a Useless Use of ls * Award ;)
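To make the difference concrete, here is a small sketch you can run in a scratch directory. The file names a.txt, b.txt and c.txt are made up for the demonstration, and set -- merely stands in for the segment.sh call so the resulting argument positions can be inspected. It suggests why the segmenter printed File: dir and Encoding: -d.

```shell
#!/bin/bash
# Scratch directory with three sample files (hypothetical names).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"
cd "$demo_dir" || exit 1

# Single quotes: the loop sees one literal word, so it runs exactly once.
quoted_count=0
for file in 'dir -d *'
do
    quoted_count=$((quoted_count + 1))
    # Unquoted $file is split on spaces and the * is globbed, just as it
    # would be on the segment.sh command line.
    set -- ctb $file UTF-8
    second_arg=$2
    third_arg=$3
    arg_count=$#
done

# Plain glob: the shell expands * itself, one iteration per file.
glob_count=0
for file in *
do
    glob_count=$((glob_count + 1))
done

echo "single-quoted loop: $quoted_count iteration, args 2 and 3: $second_arg $third_arg"
echo "glob loop: $glob_count iterations"
```

In the single-quoted loop, the second and third arguments come out as dir and -d, which is what segment.sh then treats as the file and the encoding.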
Update: originally I forgot to quote $file, as pointed out in another answer.
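Separately, the NoClassDefFoundError might be a working-directory problem rather than a loop problem. This is a guess, since segment.sh itself isn't shown: your working run was started from ~/chinesesegmenter-2006-05-11 and loads data/ctb.gz through a relative path, whereas the script changes to another directory first. The sketch below uses a made-up tool.sh and data/model as stand-ins (neither is part of the segmenter) to show how a script that relies on relative paths behaves when called from elsewhere.

```shell
#!/bin/bash
# Hypothetical stand-in for the segmenter directory: tool.sh expects
# data/model relative to the current working directory.
tool_dir=$(mktemp -d)
mkdir "$tool_dir/data"
echo "model" > "$tool_dir/data/model"
cat > "$tool_dir/tool.sh" <<'EOF'
# Fails unless started from the directory that contains data/.
[ -f data/model ] || { echo "cannot find data/model" >&2; exit 1; }
echo "processed $1"
EOF

# Hypothetical input directory with one file.
input_dir=$(mktemp -d)
touch "$input_dir/input.txt"

# Called from the input directory, the relative lookup fails.
from_input=$(cd "$input_dir" && bash "$tool_dir/tool.sh" input.txt 2>&1)

# Called from the tool's own directory with an absolute input path, it works.
from_tool=$(cd "$tool_dir" && bash ./tool.sh "$input_dir/input.txt")

echo "$from_input"
echo "$from_tool"
```

If that is what is happening here, one option would be to stay in the segmenter directory and loop over ~/path/to/whereever/* so that each "$file" is an absolute path.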
You could try:
for file in *
do
$HOME/segment.sh ctb "$file" UTF-8
done
So there were a couple of things to correct:
- Don't use : after the for statement; use ; or a newline.
- Put quotation marks around "$file" to allow whitespace in file names.
- If you want to use a command where you put 'dir -d *', you should use $(dir -d *) or backticks (`dir -d *`) instead.
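A quick way to see why the quotes matter is sketched below. count_args is a hypothetical stand-in for segment.sh that only reports how many arguments it received, and the file name is invented for the example.

```shell
#!/bin/bash
# count_args is a hypothetical stand-in for segment.sh: it just reports
# how many arguments it was given.
count_args() { echo "$#"; }

file="my notes.txt"   # hypothetical file name containing a space

# Unquoted: the name is split into two words (my, notes.txt).
unquoted=$(count_args ctb $file UTF-8)

# Quoted: the name is passed through as a single argument.
quoted=$(count_args ctb "$file" UTF-8)

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted arguments"
```

The unquoted call receives four arguments instead of three, so the script would see my as the file and notes.txt as the encoding.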
for file in 'dir -d *': do
You've put a colon instead of a semicolon.
For easier debugging, you can add -x as an option to your shebang:
#!/bin/bash -x
The errors will be easier to spot.
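One caveat: the shebang line is only honoured when the script is executed directly (e.g. ./segchi.sh). When it is started as bash segchi.sh, that line is just a comment, so you would pass the option yourself (bash -x segchi.sh) or put set -x inside the script. A minimal sketch of what the trace looks like, using a throwaway script with made-up contents:

```shell
#!/bin/bash
# Write a tiny throwaway script (hypothetical contents) to trace.
script=$(mktemp)
cat > "$script" <<'EOF'
file='dir -d *'
echo "$file"
EOF

# bash -x prints each command to stderr before running it; capture only
# that trace and discard the script's normal output.
trace=$(bash -x "$script" 2>&1 >/dev/null)
echo "$trace"
rm -f "$script"
```

Each traced line shows the command after expansion, which makes it easy to see what file actually contains on every iteration.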