Iterate through different permutations of 4 functions in Python
OK, I am using different taggers to tag a text: default, unigram, bigram and trigram.
I have to check which combination of three of those four taggers is the most accurate.
To do that I have to loop through all the possible combinations, which I do like this:
permutaties = list(itertools.permutations(['default_tagger', 'unigram_tagger',
                                           'bigram_tagger', 'trigram_tagger'], 3))
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element))
So each element is a tuple of three tagger names, for example: ('default_tagger', 'bigram_tagger', 'trigram_tagger')
In the accuracy function I now have to dynamically call the tagger that belongs to each of the three names. The problem is: I don't know how to do this.
The tagger functions are as follows:
unigram_tagger = nltk.UnigramTagger(brown_train, backoff=backofff)
bigram_tagger = nltk.BigramTagger(brown_train, backoff=backofff)
trigram_tagger = nltk.TrigramTagger(brown_train, backoff=backofff)
default_tagger = nltk.DefaultTagger('NN')
So for the example the code should become:
t0 = nltk.DefaultTagger('NN')
t1 = nltk.BigramTagger(brown_train, backoff=t0)
t2 = nltk.TrigramTagger(brown_train, backoff=t1)
t2.evaluate(brown_test)
So in essence the problem is how to iterate through all 24 ordered selections (permutations) of three taggers out of that list of 4 functions.
Any Python Masters that can help me?
Not sure if I understood what you need, but you can use the callables you want to call themselves instead of strings, so your code could become something like:
permutaties = itertools.permutations([nltk.UnigramTagger, nltk.BigramTagger, nltk.TrigramTagger, nltk.DefaultTagger], 3)
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element, brown_train, brown_test))

def accuracy(element, brown_train, brown_test):
    if element is nltk.DefaultTagger:
        evaluator = element("NN")
    else:
        # maybe insert more elif clauses to retrieve the proper backoff
        # parameter, or you could use a tuple in the call to permutations
        # so the appropriate backoff is available for each function to be called
        evaluator = element(brown_train, backoff=XXX)
    return evaluator.evaluate(brown_test)  # ? I am not sure from your code if this is your intent
Starting with jsbueno's code, I suggest writing a wrapper function for each of the taggers to give them the same signature. And since you only need them once, I suggest using a lambda.
permutaties = itertools.permutations([lambda: nltk.DefaultTagger("NN"),
                                      lambda: nltk.UnigramTagger(brown_train, backoff=backoff),
                                      lambda: nltk.BigramTagger(brown_train, backoff=backoff),
                                      lambda: nltk.TrigramTagger(brown_train, backoff=backoff)], 3)
This would allow you to call each directly, without a special function that figures out which function you're calling and employs the appropriate signature.
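As a rough sketch of how such wrappers might be called, here is a variation where each lambda takes the backoff tagger as its single argument, so every tagger can back off to the one built just before it. To keep it runnable without nltk or a corpus, `FakeTagger`, `factories` and `build` are hypothetical names invented for this illustration; with nltk the lambda bodies would presumably be calls like `nltk.BigramTagger(brown_train, backoff=backoff)`.

```python
import itertools


class FakeTagger:
    """Hypothetical stand-in for an nltk tagger; it only records its backoff."""

    def __init__(self, name, backoff=None):
        self.name = name
        self.backoff = backoff

    def chain(self):
        """List this tagger's name followed by the names of its backoffs."""
        names = []
        tagger = self
        while tagger is not None:
            names.append(tagger.name)
            tagger = tagger.backoff
        return names


# Each wrapper has the same signature: it takes the backoff tagger (or None).
# With nltk these bodies would be calls such as
# nltk.BigramTagger(brown_train, backoff=backoff).
factories = {
    "default": lambda backoff: FakeTagger("default"),  # ignores its backoff
    "unigram": lambda backoff: FakeTagger("unigram", backoff),
    "bigram": lambda backoff: FakeTagger("bigram", backoff),
    "trigram": lambda backoff: FakeTagger("trigram", backoff),
}


def build(order):
    """Build the taggers in the given order, each backing off to the previous one."""
    tagger = None
    for key in order:
        tagger = factories[key](tagger)
    return tagger


orders = list(itertools.permutations(factories, 3))
example = build(("default", "bigram", "trigram"))
print(len(orders))      # 24
print(example.chain())  # ['trigram', 'bigram', 'default']
```

Because the default wrapper ignores its argument, placing it anywhere but first drops whatever chain was built before it, which mirrors the fact that `nltk.DefaultTagger` takes only a tag.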
Building on jsbueno's code, I think you want to reuse evaluator as the backoff argument for the next tagger, so the code should be:
import itertools
import nltk

def accuracy(element, brown_train, brown_test):
    evaluator = None  # no backoff for the first tagger in the chain
    for e in element:
        if e is nltk.DefaultTagger:
            # DefaultTagger takes only a tag and has no backoff of its own
            evaluator = e("NN")
        else:
            # each n-gram tagger backs off to the tagger built just before it
            evaluator = e(brown_train, backoff=evaluator)
    return evaluator.evaluate(brown_test)

permutaties = itertools.permutations([nltk.UnigramTagger, nltk.BigramTagger, nltk.TrigramTagger, nltk.DefaultTagger], 3)
resultaten = []
for element in permutaties:
    resultaten.append(accuracy(element, brown_train, brown_test))
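Once an accuracy function like the one above returns a score per permutation, finding the best order is just a matter of keeping each score together with its tuple. The sketch below shows one way that might look; `rank_orders` and `toy_score` are hypothetical names, and `toy_score` is only a placeholder that makes the example run without nltk. It is not a real accuracy measurement.

```python
import itertools

names = ["default_tagger", "unigram_tagger", "bigram_tagger", "trigram_tagger"]


def rank_orders(names, score, size=3):
    """Score every permutation of `size` names and sort best-first."""
    scored = [(score(order), order) for order in itertools.permutations(names, size)]
    return sorted(scored, reverse=True)


def toy_score(order):
    """Placeholder score: count adjacent pairs that go from a simpler tagger
    to a more complex one. A real run would train and evaluate taggers here."""
    rank = {name: i for i, name in enumerate(names)}
    return sum(rank[a] < rank[b] for a, b in zip(order, order[1:]))


ranked = rank_orders(names, toy_score)
best_score, best_order = ranked[0]
print(len(ranked))             # 24 scored permutations
print(best_score, best_order)
```

Swapping `toy_score` for a function that builds and evaluates the real taggers would leave the ranking code unchanged.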