Validating Syntax of a Sentence [closed]

Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.

Closed 6 years ago.

I'm looking for a library that simply validates the syntax of English natural-language sentences. It doesn't have to be correct all the time (obviously some sentences will be ambiguous, and humans will disagree on validity).

So for example, "jim likes the blue ball" would be valid, whereas "jim likes likes blue ball jim" would not be.

I've tried "Syntactic parser of English sentences" by Andrej Pancik, which appears to do what I want, but unfortunately it rejects most sentences that I'd consider "valid".

Is there any code out there I can use? Otherwise I'm thinking of doing this myself by building a parse tree with something like ANTLR and identifying nouns with WordNet.


You won't find this a) easy to do, or b) likely available as a package that just works.

People don't agree on what English is:

  Colorless green ideas slept furiously.

Thus you can't really write a program that reliably does what you want. There are NLP parsers that claim to process much of English, but they aren't simple or small; I believe the Stanford Parser is one.

You can try to build your own, but you'll smack into the definition-of-English problem unless you strongly constrain what you consider to be valid English. And that will likely give you the same effect you had with Pancik's parser. (The act of writing a parser is an insistence that the language looks like what the parser accepts, regardless of the truth.)
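
To make the "strongly constrained" point concrete, here is a minimal sketch of a hand-rolled recognizer, assuming a made-up five-word lexicon and a handful of grammar rules in Chomsky normal form, checked with the standard CYK algorithm. The names `LEXICON`, `RULES` and `is_valid` are hypothetical, and nothing here reflects how Pancik's parser or any real library works.

```python
from itertools import product

# Hypothetical toy lexicon: word -> set of syntactic categories.
LEXICON = {
    "jim": {"NP"},
    "likes": {"V"},
    "the": {"Det"},
    "blue": {"Adj"},
    "ball": {"N"},
}

# Hypothetical binary rules in Chomsky normal form:
# (left child, right child) -> set of parent categories.
RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP"},
    ("Det", "N"): {"NP"},
    ("Det", "AdjN"): {"NP"},
    ("Adj", "N"): {"AdjN"},
}


def is_valid(sentence):
    """Return True if the toy grammar derives the whole sentence as an S (CYK)."""
    words = sentence.lower().split()
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds every category that derives words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        # Unknown words get no category, so any sentence containing one fails.
        table[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                pairs = product(table[start][split], table[split + 1][end])
                for left, right in pairs:
                    table[start][end] |= RULES.get((left, right), set())
    return "S" in table[0][n - 1]


if __name__ == "__main__":
    for s in ("jim likes the blue ball",
              "jim likes likes blue ball jim",
              "jim really likes the blue ball"):
        print(s, "->", is_valid(s))
```

This accepts "jim likes the blue ball" and rejects "jim likes likes blue ball jim", but it also rejects "jim really likes the blue ball", simply because "really" is not in the lexicon. That is the same over-rejection described above: the recognizer only knows the English its author wrote down.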


Syntactic parsing is a broad research field. There are a lot of parsers available, but not in C#. The state-of-the-art parsers are listed in: http://aclweb.org/aclwiki/index.php?title=Parsing_(State_of_the_art)

A gentler starting point is NLTK, written in Python.
