NLTK "generate" function: How to get back returned text?
I'm a Python noob, so bear with me.
I'm trying to work with the NLTK library, and in particular the 'generate' function. From the documentation, it looks like this function simply prints its result (http://nltk.googlecode.com/svn/trunk/doc/api/nltk.text-pysrc.html). I would like to manipulate the resulting text before printing it to the screen, but I can't figure out how to get this function to return its text.
How would I go about getting the output of this function? Do I have to change the function to return the result instead of printing it?
UPDATE: I found this link which kinda does it, but it feels pretty darn hacky. http://northernplanets.blogspot.com/2006/07/capturing-output-of-print-in-python.html Is this the best I can hope for?
All generate is doing is building a trigram model if none exists, then calling
text = self._trigram_model.generate(length)
and wrapping and printing it.
Just take the parts you want: possibly just the line above (with self replaced by the instance name), or possibly the whole thing, as below, with the final print replaced with return.
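# Imports as used by text.py in NLTK 2.x
from nltk.model import NgramModel
from nltk.probability import LidstoneProbDist
from nltk.util import tokenwrap

def generate(self, length=100):
    # Build and cache the trigram model on first use
    if '_trigram_model' not in self.__dict__:
        estimator = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)
        self._trigram_model = NgramModel(3, self, estimator)
    text = self._trigram_model.generate(length)
    return tokenwrap(text)  # or just text if you don't want to wrap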
And then you can just call it with a manually passed instance as the first argument.
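To make the "manually passed instance" part concrete, here is a minimal sketch that does not need NLTK at all. FakeText and generate_text are hypothetical names made up for illustration; FakeText just stands in for an object whose method prints rather than returns, the way the original generate does.

```python
class FakeText(object):
    """Hypothetical stand-in for a class whose method only prints its result."""

    def __init__(self, tokens):
        self.tokens = tokens

    def generate(self, length=100):
        # Mirrors the problem: the result is printed, and None is returned
        print(" ".join(self.tokens[:length]))


def generate_text(self, length=100):
    """Standalone copy of the method body that returns instead of printing."""
    return " ".join(self.tokens[:length])


text = FakeText(["the", "quick", "brown", "fox"])

# The instance is passed explicitly where 'self' would normally go
result = generate_text(text, 3)

# The returned string can now be manipulated before it is printed
print(result.upper())  # prints: THE QUICK BROWN
```

Because the function lives outside the class, nothing in the installed library has to change. With a real NLTK Text object you would pass that object in the same position.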
Go into Python26/site-packages/nltk/text.py and change the "generate" function:
def generate(self, length=100):
    if '_trigram_model' not in self.__dict__:
        print "Building ngram index..."
        estimator = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)
        self._trigram_model = NgramModel(3, self, estimator)
    text = self._trigram_model.generate(length)
    text_gen = tokenwrap(text)
    print text_gen
    return text_gen
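If you would rather not edit files under site-packages, capturing what the function prints is another option. The sketch below assumes Python 3.4 or later (contextlib.redirect_stdout does not exist in Python 2.6); noisy_generate and capture_output are hypothetical names, and noisy_generate only stands in for any function that prints its result.

```python
import io
from contextlib import redirect_stdout


def noisy_generate():
    # Hypothetical stand-in for a function that prints instead of returning
    print("generated text goes here")


def capture_output(func, *args, **kwargs):
    """Call func and return everything it wrote to stdout as a string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


captured = capture_output(noisy_generate)

# The captured text keeps the trailing newline that print() added
print(captured.strip().title())
```

This is the same idea as the stdout-swapping trick linked in the question, just wrapped in a context manager so that stdout is restored even if the call raises.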