Need to create a histogram in Python for a corpus

2023-04-09 00:59 Q&A author:

import nltk
from nltk.book import *
from nltk.corpus import brown
corpus_text = brown.words()
word_freq = FreqDist(corpus_text)
word_hist = dict()

for k,v in word_freq.iteritems():
   if key in word_hist:
      word_hist[v] = word_hist[v] + 1
   else:
      word_hist[v] = 1 

print word_hist.viewkeys()
print word_hist.viewvalues()

I'm making a mistake in the dictionary handling here. I need to create a dictionary whose keys are the words from the FreqDist and whose values are the counts of the corresponding words. How do I perform this increment?

I'm certain that

      word_hist[v] = word_hist[v] + 1
   else:
      word_hist[v] = 1

has a bug.

Of course. It seems you are replacing the word_hist dict with one of its values (plus 1). Try

word_hist[v] = word_hist[v] + 1

or even better

word_hist[v] += 1

instead.
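
As a rough sketch of the increment on its own (assuming Python 3 and a plain dict; the `counts` list is made-up sample data standing in for the values of word_freq), `dict.get` with a default of 0 avoids the if/else and the KeyError that `+=` would raise on a missing key:

```python
# Hypothetical per-word counts standing in for the values of word_freq.
counts = [3, 1, 3, 2, 1, 3]

word_hist = {}
for v in counts:
    # dict.get returns 0 when v has not been seen yet, so no if/else is needed.
    word_hist[v] = word_hist.get(v, 0) + 1

print(word_hist)  # {3: 3, 1: 2, 2: 1}
```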

EDIT: There is another bug:

for k,v in word_freq.iteritems():
   if key in word_hist:
      word_hist[v] = word_hist[v] + 1
   else:
      word_hist[v] = 1

makes no sense. key is tested for presence in word_hist, but then v is used.

I don't know what key is, but either use k or v for both.
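
For illustration, here is what the loop might look like with v used consistently, so that it counts how many words share each frequency. This is only a sketch: it assumes Python 3 (`.items()` instead of `.iteritems()`), and a `collections.Counter` over a made-up token list stands in for `FreqDist(brown.words())` so that it runs without NLTK.

```python
from collections import Counter

# Stand-in for FreqDist(brown.words()); the token list is invented for the example.
word_freq = Counter(["the", "cat", "the", "dog", "the", "cat", "sat"])

word_hist = {}
for k, v in word_freq.items():  # Python 3 spelling of .iteritems()
    if v in word_hist:          # test the same key that is updated below
        word_hist[v] = word_hist[v] + 1
    else:
        word_hist[v] = 1

print(word_hist)  # {3: 1, 2: 1, 1: 2}
```

Here one word occurs three times, one occurs twice, and two occur once.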

from collections import defaultdict
word_hist = defaultdict(int)

for k,v in word_freq.iteritems():
    word_hist[v] +=1
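
A Python 3 variant of the same idea, as a sketch rather than a drop-in replacement: the `Counter` below is an assumed stand-in for the FreqDist, and only its values are iterated because the word itself is not needed for the histogram. Counting the values with a second `Counter` should give the same mapping in one call.

```python
from collections import Counter, defaultdict

# Stand-in for the FreqDist: a -> 3, b -> 2, c -> 1, d -> 1 (sample data).
word_freq = Counter("a b a c b a d".split())

word_hist = defaultdict(int)    # missing keys start at 0
for v in word_freq.values():
    word_hist[v] += 1

# Equivalent one-liner: count how often each frequency occurs.
same_hist = Counter(word_freq.values())

print(dict(word_hist))  # {3: 1, 2: 1, 1: 2}
```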

The lines you flagged definitely have a bug, but so does the line before them.

if key in word_hist:
    word_hist[v] = word_hist[v] + 1
else:
    word_hist[v] = 1

should be

if k in word_hist:
    word_hist[k] = word_hist[k] + 1
else:
    word_hist[k] = 1

You don't need to take v from word_freq.
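
One caveat when keying on k: each word appears only once among the keys of word_freq, so looping over its items would leave every word with a count of 1. If the goal is words as keys and occurrence counts as values, the loop has to run over the tokens themselves. A sketch under that assumption, in Python 3, with a made-up `tokens` list standing in for brown.words():

```python
# Invented sample tokens; with NLTK these would come from brown.words().
tokens = ["the", "cat", "the", "dog", "the", "cat"]

word_hist = {}
for k in tokens:
    if k in word_hist:
        word_hist[k] = word_hist[k] + 1  # word seen before: bump its count
    else:
        word_hist[k] = 1                 # first occurrence of this word

print(word_hist)  # {'the': 3, 'cat': 2, 'dog': 1}
```

This is the same word-to-count mapping that the FreqDist already holds.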
