
Stanford NER - Extract multi-word entities

How can I tag collocations in Stanford NER? Currently it tags "Federal Reserve Bank of New York" as:

<wi num="11" entity="ORGANIZATION">Federal</wi> <wi num="12" entity="ORGANIZATION">Reserve</wi> <wi num="13" entity="ORGANIZATION">Bank</wi> <wi num="14" entity="ORGANIZATION">of</wi> <wi num="15" entity="ORGANIZATION">New</wi> <wi num="16" entity="ORGANIZATION">York</wi>

I want it to be recognized as

<wi num="11" entity="ORGANIZATION">Federal Reserve Bank of New York</wi>

Is this possible?


Something similar is possible, yes. If you pass the flag

-outputFormat inlineXML

then you'll get:

<ORGANIZATION>Federal Reserve Bank of New York</ORGANIZATION>

(Note that this isn't really changing how Stanford NER works but just the formatting of output. If you don't like any of the provided output formats, it is fairly simple to write your own.)
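
If you do end up writing your own output format, the core of it is just merging adjacent tokens that carry the same label. Here is a rough sketch of that idea in Python. It does not call Stanford NER at all: it assumes you have already got the tagger's output into a list of (word, label) pairs, with "O" as the label for non-entity tokens. The function names `merge_entities` and `to_inline_xml` are made up for this example.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def merge_entities(tagged_tokens, outside_label="O"):
    """Collapse runs of consecutive tokens with the same label into one span.

    tagged_tokens: list of (word, label) pairs (assumed input shape).
    Tokens labelled outside_label are kept as individual words.
    """
    spans = []
    for label, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        words = [word for word, _ in group]
        if label == outside_label:
            # Non-entity tokens stay separate.
            spans.extend((word, label) for word in words)
        else:
            # A run of same-label tokens becomes a single multi-word entity.
            spans.append((" ".join(words), label))
    return spans


def to_inline_xml(spans, outside_label="O"):
    """Render spans in an inline-XML-like style, escaping &, < and >."""
    parts = []
    for text, label in spans:
        if label == outside_label:
            parts.append(escape(text))
        else:
            parts.append(f"<{label}>{escape(text)}</{label}>")
    return " ".join(parts)


tagged = [
    ("The", "O"),
    ("Federal", "ORGANIZATION"),
    ("Reserve", "ORGANIZATION"),
    ("Bank", "ORGANIZATION"),
    ("of", "ORGANIZATION"),
    ("New", "ORGANIZATION"),
    ("York", "ORGANIZATION"),
    ("said", "O"),
]

print(to_inline_xml(merge_entities(tagged)))
# The <ORGANIZATION>Federal Reserve Bank of New York</ORGANIZATION> said
```

One thing to be aware of: with plain per-token labels there is no boundary information, so two different entities of the same type that sit directly next to each other would be merged into one span by this approach.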
