Stanford NER - Extract Multi word entities
How can I tag collocations (multi-word entities) in Stanford NER? Currently it tags "Federal Reserve Bank of New York" one token at a time, as
<wi num="11" entity="ORGANIZATION">Federal</wi> <wi num="12" entity="ORGANIZATION">Reserve</wi> <wi num="13" entity="ORGANIZATION">Bank</wi> <wi num="14" entity="ORGANIZATION">of</wi> <wi num="15" entity="ORGANIZATION">New</wi> <wi num="16" entity="ORGANIZATION">York</wi>
I want it to be recognized as
<wi num="11" entity="ORGANIZATION">Federal Reserve Bank of New York</wi>
Is this possible?
Something similar is, yes. If you give the flag
-outputFormat inlineXML
then you'll get:
<ORGANIZATION>Federal Reserve Bank of New York</ORGANIZATION>
(Note that this isn't really changing how Stanford NER works but just the formatting of output. If you don't like any of the provided output formats, it is fairly simple to write your own.)
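If you do end up writing your own formatter, the core of it is just grouping adjacent tokens that carry the same entity label. Here is a minimal sketch in plain Python. It does not call Stanford NER at all: it assumes you have already parsed the tagger's output into a list of (word, label) pairs, with "O" for non-entity tokens. The function names `merge_entities` and `to_inline_xml` are made up for this illustration.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def merge_entities(tagged):
    """Collapse runs of adjacent tokens that share the same label.

    `tagged` is a list of (word, label) pairs, an assumed intermediate
    format and not something Stanford NER emits directly.
    """
    spans = []
    for label, run in groupby(tagged, key=lambda pair: pair[1]):
        words = [word for word, _ in run]
        spans.append((" ".join(words), label))
    return spans


def to_inline_xml(tagged, outside="O"):
    """Render merged spans in a style similar to the inlineXML output."""
    parts = []
    for text, label in merge_entities(tagged):
        text = escape(text)  # keep &, <, > from breaking the markup
        if label == outside:
            parts.append(text)
        else:
            parts.append(f"<{label}>{text}</{label}>")
    return " ".join(parts)


tagged = [
    ("the", "O"),
    ("Federal", "ORGANIZATION"),
    ("Reserve", "ORGANIZATION"),
    ("Bank", "ORGANIZATION"),
    ("of", "ORGANIZATION"),
    ("New", "ORGANIZATION"),
    ("York", "ORGANIZATION"),
    ("said", "O"),
]
print(to_inline_xml(tagged))
```

One caveat with this approach: two different entities of the same type that sit directly next to each other will be merged into a single span, because plain per-token labels carry no boundary information.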