Detect English verb tenses using NLTK
I am looking for a way, given an English text, to count the verb phrases in it in past, present, and future tenses. For now I am using NLTK, doing POS (Part-Of-Speech) tagging, and then counting, say, 'VBD' tags to get past tenses. This is not accurate enough though, so I guess I need to go further and use chunking, then analyze VP-chunks for specific tense patterns. Is there anything existing that does that? Any further reading that might be helpful? The NLTK book focuses mostly on NP-chunks, and I could find very little information on VP-chunks.
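For reference, here is roughly what I have now (a minimal sketch; it assumes the punkt and tagger models have been downloaded, and the example text is made up):

```python
import nltk

text = "He went home. She is reading. They will arrive tomorrow."

past = 0
for sentence in nltk.sent_tokenize(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    # Count simple past forms only; this misses participles, modals, etc.
    past += sum(1 for word, tag in tagged if tag == 'VBD')

print(past)
```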
The exact answer depends on which chunker you intend to use, but a list comprehension will take you a long way. This gets you the number of verb phrases, using a non-existent chunker:
len([phrase for phrase in nltk.Chunker(sentence) if phrase[1] == 'VP'])
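With a real chunker the same idea works; for instance, with NLTK's built-in RegexpParser (the VP grammar below is only a rough illustration, not a complete one):

```python
import nltk

# A very rough VP grammar: optional modal, one or more verbs, trailing adverbs.
grammar = "VP: {<MD>?<VB.*>+<RB.*>*}"
chunker = nltk.RegexpParser(grammar)

sentence = "She will have finished the report by Friday."
tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
tree = chunker.parse(tagged)

# Count the VP chunks the grammar found.
vp_count = len([st for st in tree.subtrees() if st.label() == 'VP'])
print(vp_count)
```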
You can take a more fine-grained approach to count the individual tenses, as sketched below.
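A sketch of what that could look like, classifying each VP chunk by simple tag heuristics (the grammar and the tense rules here are assumptions and far from complete):

```python
import nltk
from collections import Counter

grammar = "VP: {<MD>?<VB.*>+<RB.*>*}"     # same rough grammar as above
chunker = nltk.RegexpParser(grammar)

def classify_tense(tags):
    """Crude tense heuristic over the POS tags of one VP chunk."""
    if 'MD' in tags:                       # e.g. "will arrive" -> call it future
        return 'future'
    if 'VBD' in tags or 'VBN' in tags:     # simple past / past participle
        return 'past'
    if 'VBZ' in tags or 'VBP' in tags or 'VBG' in tags:
        return 'present'
    return 'other'

counts = Counter()
text = "He left early. She is reading. They will arrive soon."
for sentence in nltk.sent_tokenize(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    for subtree in chunker.parse(tagged).subtrees():
        if subtree.label() == 'VP':
            counts[classify_tense([tag for _, tag in subtree.leaves()])] += 1

print(counts)
```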
You can also do this with either the Berkeley Parser or the Stanford Parser, but I don't know whether a Python interface is available for either.