Python generator vs callback function

I have a class that solves an exact cover problem using a recursive, backtracking algorithm. Originally, I implemented the class with a callback function passed to the object during initialization; the callback is invoked whenever a solution is found. Looking at someone else's implementation of the same problem, I saw that they were using yield statements to pass each solution out; in other words, their code was a Python generator. I thought this was an interesting idea, so I made a new version of my class that uses yield. I then ran comparison tests between the two versions and, to my surprise, found the generator version ran 5 times slower than the callback version. Note that, except for switching in a yield for a callback, the code is identical.

What is going on here? I'm speculating that, because a generator needs to save its state before yielding and then restore that state when resumed on the next call, it is this save/restore that makes the generator version run so much slower. If so, how much state information is the generator having to save and restore?
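
One way to sanity-check that theory is a micro-benchmark that isolates a single suspend/resume against a single callback invocation. Here is a toy sketch (hypothetical names, not my solver):

   import timeit

   def one_value_gen():
      yield True                  # one suspend/resume round trip

   def one_value_cb(callback):
      callback(True)              # one ordinary function call

   # Compare the raw cost of consuming a single result each way.
   print(timeit.timeit(lambda: list(one_value_gen()), number=1000000))
   print(timeit.timeit(lambda: one_value_cb([].append), number=1000000))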

Any ideas from the Python experts?

--Edited 7:40 PDT

Here is the solver code, which uses yield. To get back the original callback version, replace the first yield below with a call to the callback function, and replace the loop containing the second yield with a plain recursive call to solve.

   def solve(self):
      for tp in self.pieces:
         if self.inuse[tp.name]: continue

         self.inuse[tp.name] = True
         while tp.next_orientation() is not None:
            if tp.insert_piece():
               self.n_trials += 1
               self.pieces_in += 1
               self.free_cells -= tp.size

               if self.pieces_in == len(self.pieces) or self.free_cells == 0:
                  self.solutions += 1
                  self.haveSolution = True
                  yield True                 # report this solution to the caller
                  self.haveSolution = False
               else:
                  self.table.next_base_square()
                  for tf in self.solve():    # re-yield every solution found at
                     yield tf                # the deeper recursion levels

               tp.remove_piece()
               self.pieces_in -= 1
               self.table.set_base_square(tp.base_square)
               self.free_cells += tp.size

         self.inuse[tp.name] = False
         tp.reset_orientation()

The main loop which invokes the solver (after initialization, of course) is:

   start_time = time.time()
   for tf in s.solve():
      printit(s)

   end_time = time.time()
   delta_time = end_time - start_time

In the callback version, the loop is gone; there is just a single call to solve.
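
To make the two shapes concrete, here is a self-contained toy sketch with hypothetical functions (not the solver class). Both enumerate the leaves of a depth-10 binary tree as "solutions"; the only structural difference is the pass-through loop in solve_gen:

   def solve_gen(depth):
      if depth == 0:
         yield True
      else:
         for _ in (0, 1):
            for tf in solve_gen(depth - 1):   # re-yield deeper results
               yield tf

   def solve_cb(depth, callback):
      if depth == 0:
         callback(True)                       # report the solution directly
      else:
         for _ in (0, 1):
            solve_cb(depth - 1, callback)     # plain recursion, no re-yield loop

   found = []
   solve_cb(10, found.append)
   print(len(found))                          # 1024
   print(sum(1 for tf in solve_gen(10)))      # 1024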


What I meant in my comment ("yielding from a recursive function sounds like it requires extra for loops to pass the results down to the caller") is this:

          for tf in self.solve():
             yield tf

These lines loop over the results coming up from the deeper recursion stages. That means each individual result is iterated over again at every level of the recursion above it, resulting in a lot of unnecessary looping.

Let me illustrate with this example:

   n = 0

   def rekurse(z):
       global n
       if z:
           yield z
           for x in rekurse(z - 1):
               n += 1                # count every pass-through hand-off
               yield x

   print(list(rekurse(10)))
   print(n)                          # 45 hand-offs for just 10 results

As you can see, this simply counts down from 10, so you'd expect a linear number of iterations. What you see instead is that n grows quadratically: rekurse(10) loops over 9 items, rekurse(9) over 8 items, and so on, so n = 9 + 8 + … + 1 = 45 for just 10 results.

The more items you have, the more time Python spends in these trivial pass-through lines. Callbacks avoid that problem completely, so I'd suspect this is what is slowing down your code.
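
A rough timing sketch (again a hypothetical toy, not the actual solver) makes this visible: the generator version does quadratically many hand-offs while the callback version does a linear number of calls, so the ratio should keep growing with the recursion depth:

   import timeit

   def count_down_gen(z):
       if z:
           yield z
           for x in count_down_gen(z - 1):   # each deeper result is handed
               yield x                       # up one level at a time

   def count_down_cb(z, callback):
       if z:
           callback(z)                       # delivered straight to the consumer
           count_down_cb(z - 1, callback)

   for z in (50, 100, 200):
       t_gen = timeit.timeit(lambda: list(count_down_gen(z)), number=500)
       t_cb = timeit.timeit(lambda: count_down_cb(z, [].append), number=500)
       print(z, t_gen / t_cb)                # the ratio keeps growing with z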

An optimized implementation of PEP 380 could fix this (see the Optimisations section of the PEP). In the meantime, I don't think it's a good idea to yield from recursive functions, at least if they recurse deeply; the two just don't work well together.
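
For reference, PEP 380's syntax eventually became the yield from statement in Python 3.3. Assuming such an interpreter, the pass-through in the example above collapses to a single delegation, which the interpreter is free to optimise as the PEP describes:

   def rekurse(z):
       if z:
           yield z
           yield from rekurse(z - 1)   # replaces the explicit for/yield loop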
