
How to extract specific nodes from the Stanford Parser?

My main problem is that I don't know how to extract nodes from a GrammaticalStructure. I am using englishPCFG.ser in Java (NetBeans). My goal is to find out what a sentence says about the quality of the screen, for example:

The screen of iphone 4 is great.

I want to extract "screen" and "great". How can I extract the NN (screen) and the VP (great)?

The code that I wrote is:

LexicalizedParser lp = new LexicalizedParser("C:\\englishPCFG.ser");
lp.setOptionFlags(new String[]{"-maxLength", "80", "-retainTmpSubcategories"});

String sent ="the screen is very good.";
Tree parse = (Tree) lp.apply(Arrays.asList(sent));
parse.pennPrint();
System.out.println();

TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
Collection tdl = gs.typedDependenciesCollapsed();


The collection tdl is a list of typed dependencies. For the sentence "The screen of iphone 4 is great.", it contains:

det(screen-2, the-1)
nsubj(great-7, screen-2)
amod(4-5, iphone-4)
prep_of(screen-2, 4-5)
cop(great-7, is-6)

(as you can see by trying it out online).

So the dependency you want, nsubj(great-7, screen-2), is right there in that list. nsubj means that "screen" is the subject of "great".
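As a rough illustration of picking that dependency out of the list, here is a minimal sketch that works on the printed form of the dependencies shown above, so it runs without the parser on the classpath. The class name `NsubjExtractor` and the method `extract` are made up for this example and are not part of the Stanford Parser. With the real objects you would presumably read the relation, governor, and dependent from each `TypedDependency` (via `reln()`, `gov()`, and `dep()`) rather than parsing strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Stanford Parser API.
class NsubjExtractor {
    // Matches the printed form "relation(governor-index, dependent-index)".
    private static final Pattern DEP =
        Pattern.compile("^(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)$");

    /** Returns {governor, dependent} word pairs for every dependency with the given relation. */
    static List<String[]> extract(List<String> dependencies, String relation) {
        List<String[]> pairs = new ArrayList<>();
        for (String line : dependencies) {
            Matcher m = DEP.matcher(line.trim());
            if (m.matches() && m.group(1).equals(relation)) {
                pairs.add(new String[] {m.group(2), m.group(4)});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The dependencies listed above, in their printed form.
        List<String> tdl = Arrays.asList(
            "det(screen-2, the-1)",
            "nsubj(great-7, screen-2)",
            "amod(4-5, iphone-4)",
            "prep_of(screen-2, 4-5)",
            "cop(great-7, is-6)");
        for (String[] pair : extract(tdl, "nsubj")) {
            // pair[0] is the governor ("great"), pair[1] the subject ("screen").
            System.out.println(pair[1] + " -> " + pair[0]); // prints "screen -> great"
        }
    }
}
```

Filtering on the relation name is enough here because the copula is collapsed: "great" ends up as the governor of nsubj, and "is" hangs off it as cop.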

The collection of dependencies is just a Collection (List). For more sophisticated processing, people commonly turn the dependencies into a graph structure that can be searched and traversed. There are various ways of doing that. We often use the [jgrapht](http://www.jgrapht.org/) library. But that is then code you are writing yourself.
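To give a feel for what such a graph structure could look like, here is a small sketch that uses plain `java.util` maps instead of jgrapht, purely to keep the example free of dependencies. The class `DependencyGraph` and its methods are hypothetical names for this illustration. It again assumes the printed form of the dependencies and uses the indexed tokens (such as `screen-2`) as node keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical adjacency-list graph; a stand-in for a graph library such as jgrapht.
class DependencyGraph {
    private static final Pattern DEP =
        Pattern.compile("^(\\w+)\\((.+-\\d+), (.+-\\d+)\\)$");

    // governor token -> list of {relation, dependent token} edges
    private final Map<String, List<String[]>> outgoing = new LinkedHashMap<>();

    /** Adds one edge from a line such as "nsubj(great-7, screen-2)". */
    void addDependency(String line) {
        Matcher m = DEP.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dependency: " + line);
        }
        outgoing.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                .add(new String[] {m.group(1), m.group(3)});
    }

    /** Returns the dependents attached to the governor by the given relation. */
    List<String> dependents(String governor, String relation) {
        List<String> result = new ArrayList<>();
        for (String[] edge : outgoing.getOrDefault(governor, Collections.emptyList())) {
            if (edge[0].equals(relation)) {
                result.add(edge[1]);
            }
        }
        return result;
    }

    /** Returns every token reachable from the root, in depth-first order. */
    List<String> reachableFrom(String root) {
        Set<String> visited = new LinkedHashSet<>();
        visit(root, visited);
        return new ArrayList<>(visited);
    }

    private void visit(String node, Set<String> visited) {
        if (!visited.add(node)) {
            return; // already seen
        }
        for (String[] edge : outgoing.getOrDefault(node, Collections.emptyList())) {
            visit(edge[1], visited);
        }
    }

    public static void main(String[] args) {
        DependencyGraph graph = new DependencyGraph();
        for (String line : Arrays.asList(
                "det(screen-2, the-1)",
                "nsubj(great-7, screen-2)",
                "amod(4-5, iphone-4)",
                "prep_of(screen-2, 4-5)",
                "cop(great-7, is-6)")) {
            graph.addDependency(line);
        }
        System.out.println(graph.dependents("great-7", "nsubj"));
        System.out.println(graph.reachableFrom("great-7"));
    }
}
```

Starting from a predicate such as `great-7`, a traversal like this reaches the subject and then its modifiers (`prep_of`, `amod`), which is one way to recover that the screen in question is the one "of iphone 4".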
