Can I create a corpus from a collection of strings in NLTK? [duplicate]

This question already has answers here: Creating a new corpus with NLTK (4 answers) Closed 9 years ago.

Is there a way to create a corpus without having to store the items in files? For instance, I want to manipulate tweets or paragraphs that I am grabbing from the web. Can I do something like:

myCorpus = MyCorpus([
    ('id', 'item', 'category'), 
    ('id', 'item', 'category'),
    ('id', 'item', 'category'), 
    ... ])

Or

myCorpus.add('id', 'item', 'category')

The purpose is to manipulate the corpus with existing NLTK capabilities. I checked TextCollection but it seems that it doesn't handle categories.


Why not just write the strings out to a file or files and then process them as a corpus?
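
A minimal sketch of that approach, in case it helps. It assumes each item is an (id, text, category) tuple as in the question, writes each one to `<root>/<category>/<id>.txt` in a temporary directory, and then points NLTK's `CategorizedPlaintextCorpusReader` at that directory. The `write_corpus` helper, the sample items, and the directory layout are made up for illustration; the ids and categories are assumed to be safe to use as file and directory names.

```python
import tempfile
from pathlib import Path


def write_corpus(items, root):
    """Write (id, text, category) tuples to root/<category>/<id>.txt."""
    root = Path(root)
    for item_id, text, category in items:
        cat_dir = root / category
        cat_dir.mkdir(parents=True, exist_ok=True)
        (cat_dir / f"{item_id}.txt").write_text(text, encoding="utf-8")
    return root


# Hypothetical sample data standing in for tweets or scraped paragraphs.
items = [
    ("t1", "NLTK makes text processing easier.", "tech"),
    ("t2", "The match ended in a draw.", "sports"),
    ("t3", "Python 3 is widely used.", "tech"),
]

corpus_root = write_corpus(items, tempfile.mkdtemp())

# NLTK is a third-party package; the reader part is skipped if it is missing.
try:
    from nltk.corpus.reader import CategorizedPlaintextCorpusReader
except ImportError:
    CategorizedPlaintextCorpusReader = None

if CategorizedPlaintextCorpusReader is not None:
    corpus = CategorizedPlaintextCorpusReader(
        str(corpus_root),
        r".+\.txt",                  # which files belong to the corpus
        cat_pattern=r"([^/]+)/.*",   # category = name of the parent directory
    )
    print(corpus.categories())
    print(corpus.fileids(categories="tech"))
    print(corpus.words(categories="tech"))
```

Once the reader exists, the usual corpus methods (`raw()`, `words()`, `fileids()`, `categories()`) accept a `categories=` argument, so the rest of NLTK can treat the strings like any other categorized corpus.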
