Tagged input to Stanford parser
Can anyone please tell me how I can feed the Stanford Parser my own tagged input sentence? The tagged sentence is, say:
A/NN quick/JJ brown/JJ fox/NN
From the documentation, I found that the flag -tagSeparator / should work, but I am pretty lost here: I don't know how to use this flag from within my program. Or is there another way?
Please help.
Within the API you have to tokenize the words and tags yourself, and then feed words with tags to the parse method. See the Javadoc documentation of the parse method:
public boolean parse(List<? extends HasWord> sentence)
You pass it a list of tokens, which could be Word, TaggedWord or CoreLabel objects. If those objects implement HasTag, then any tag they store will be extracted and used. E.g., the following will work:
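import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.trees.Tree;

// lp is assumed to be an already-loaded LexicalizedParser
String[] words = { "This", "is", "an", "easy", "sentence", "." };
String[] tags = { "DT", "VBZ", "DT", "JJ", "NNP", "." };
assert words.length == tags.length;
List<TaggedWord> sentence = new ArrayList<TaggedWord>();
for (int i = 0; i < words.length; i++) {
    // Pair each word with the tag the parser should use for it
    sentence.add(new TaggedWord(words[i], tags[i]));
}
Tree parse = lp.apply(sentence);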
If you look at the output parse tree, "sentence" will be (wrongly) tagged "NNP", since that is the tag that was supplied.
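To get from a slash-separated string like the one in the question to words and tags that can be fed to TaggedWord, the splitting step might look roughly like the sketch below. It is plain Java with no parser dependency. The class name TaggedInputSplitter and its split method are made up for illustration and are not part of the Stanford API. It assumes tokens are separated by whitespace and splits each token on the last separator, so a word that itself contains a slash keeps it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Stanford Parser API.
public class TaggedInputSplitter {

    /** Splits "word/TAG word/TAG ..." into {word, tag} pairs. */
    public static List<String[]> split(String tagged, String separator) {
        List<String[]> pairs = new ArrayList<>();
        String trimmed = tagged.trim();
        if (trimmed.isEmpty()) {
            return pairs;
        }
        for (String token : trimmed.split("\\s+")) {
            // Use the LAST separator so a word such as "1/2" keeps its slash.
            int cut = token.lastIndexOf(separator);
            boolean emptyWord = cut <= 0;
            boolean emptyTag = cut + separator.length() >= token.length();
            if (emptyWord || emptyTag) {
                throw new IllegalArgumentException(
                        "Not in word" + separator + "TAG form: " + token);
            }
            String word = token.substring(0, cut);
            String tag = token.substring(cut + separator.length());
            pairs.add(new String[] { word, tag });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : split("A/NN quick/JJ brown/JJ fox/NN", "/")) {
            // With the parser on the classpath, each pair would become
            // new TaggedWord(pair[0], pair[1]), as in the loop above.
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

Each returned pair holds the word at index 0 and the tag at index 1, so the result can be walked in the same way as the words and tags arrays in the answer.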