python izip which cycles through all iterables until the longest finishes
This turned out not to be a trivial task for me, and I couldn't find a recipe for it, so maybe you can point me to one, or you have a ready, proper and well-tuned solution? Proper meaning it also works for iterators that do not know their own length (no __len__) and for exhaustible iterators (e.g. chained iterators); well-tuned meaning fast.
Note: an in-place solution is not possible, because the iterators' outputs have to be cached in order to re-iterate them (Glenn Maynard pointed that out).
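To illustrate why caching is needed, here is a minimal sketch (the name one_shot is only illustrative): a chained iterator has no __len__ and yields nothing on a second pass, so cycling it means storing what it produced the first time.

```python
from itertools import chain

# A chained iterator has no __len__ and can only be consumed once.
one_shot = chain(range(1), range(2))

first_pass = list(one_shot)   # consumes the iterator
second_pass = list(one_shot)  # nothing left to yield

print(first_pass)                    # [0, 0, 1]
print(second_pass)                   # []
print(hasattr(one_shot, '__len__'))  # False
```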
Example usage:
>>> list(izip_cycle(range(2), range(5), range(3)))
[(0, 0, 0), (1, 1, 1), (0, 2, 2), (1, 3, 0), (0, 4, 1)]
>>> from itertools import islice, cycle, chain
>>> list(islice(izip_cycle(cycle(range(1)), chain(range(1), range(2))), 6))
[(0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 1)]
Here is something inspired by itertools.tee and itertools.cycle. It works for any kind of iterable:
from itertools import izip

class izip_cycle(object):
    def __init__(self, *iterables):
        self.remains = len(iterables)
        self.items = izip(*[self._gen(it) for it in iterables])

    def __iter__(self):
        return self.items

    def _gen(self, src):
        q = []
        for item in src:
            yield item
            q.append(item)
        # done with this src
        self.remains -= 1
        # if there are any other sources then cycle this one;
        # the last source remaining stops here and thus stops the izip
        if self.remains:
            while True:
                for item in q:
                    yield item
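For readers on Python 3, where izip no longer exists and zip is already lazy, the same idea might be sketched as a plain function. This is an adaptation, not the answer's original code: the one-element list named remaining stands in for self.remains, and the extra check on cache is an addition so that an empty source ends the zip rather than looping without yielding.

```python
from itertools import chain, cycle, islice

def izip_cycle(*iterables):
    """Zip iterables, cycling the shorter ones until the longest finishes."""
    remaining = [len(iterables)]  # sources not yet exhausted (mutable cell)

    def gen(src):
        cache = []
        for item in src:
            yield item
            cache.append(item)
        # This source is done.
        remaining[0] -= 1
        # Replay the cache only while another source is still running and
        # there is something to replay; otherwise finish, which stops zip().
        while remaining[0] and cache:
            yield from cache

    return zip(*[gen(it) for it in iterables])

print(list(izip_cycle(range(2), range(5), range(3))))
# [(0, 0, 0), (1, 1, 1), (0, 2, 2), (1, 3, 0), (0, 4, 1)]
print(list(islice(izip_cycle(cycle(range(1)), chain(range(1), range(2))), 6)))
# [(0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 1)]
```

Both calls reproduce the outputs given in the question's example usage.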
A simple approach which might work for you, depending on your requirements, is:
import itertools

def izip_cycle(*colls):
    maxlen = max(len(c) if hasattr(c, '__len__') else 0 for c in colls)
    g = itertools.izip(*[itertools.cycle(c) for c in colls])
    for _ in range(maxlen):
        yield g.next()
The first thing this does is find the length of the longest sequence, so it knows how many times to repeat. Sequences without __len__ are counted as having length 0; this might be what you want: if you have an unending sequence, you probably want to repeat over the finite sequences. However, this doesn't handle finite iterators with no length.
Next we use itertools.cycle to create a cycling version of each iterator and then use itertools.izip to zip them together.
Finally we yield each entry from our zip until we've given our desired number of results.
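As a rough Python 3 rendering of the steps just described (a sketch, not the answer's original code; the name izip_cycle_sized is made up here), zip replaces izip and islice does the counting. It keeps the same limitation: an iterator without __len__ counts as length 0.

```python
from itertools import cycle, islice

def izip_cycle_sized(*colls):
    # Longest known length; iterables without __len__ count as 0,
    # mirroring the approach described above.
    maxlen = max((len(c) for c in colls if hasattr(c, '__len__')), default=0)
    # cycle() repeats each input forever; islice() cuts the zip off
    # after maxlen tuples.
    return islice(zip(*map(cycle, colls)), maxlen)

print(list(izip_cycle_sized(range(2), range(5), range(3))))
# [(0, 0, 0), (1, 1, 1), (0, 2, 2), (1, 3, 0), (0, 4, 1)]

# The limitation: a finite iterator with no length is cut short,
# because only the one-element list contributes to maxlen.
print(list(izip_cycle_sized(iter([1, 2, 3]), [0])))
# [(1, 0)]
```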
If you want this to work for finite iterators with no len, we need to do more of the work ourselves:
def izip_cycle(*colls):
    iters = [iter(c) for c in colls]
    count = len(colls)
    saved = [[] for i in range(count)]
    exhausted = [False] * count

    while True:
        r = []

        for i in range(count):
            if not exhausted[i]:
                try:
                    n = iters[i].next()
                    saved[i].append(n)
                    r.append(n)
                except StopIteration:
                    exhausted[i] = True
                    if all(exhausted):
                        return
                    saved[i] = itertools.cycle(saved[i])
            if exhausted[i]:
                r.append(saved[i].next())

        yield r
This is basically an extension of the Python implementation of itertools.cycle in the documentation, made to run over multiple sequences. We save up the items we've seen in saved so we can repeat them, and track which sequences have run out in exhausted.
As this version waits for all the sequences to run out, if you pass in something infinite the cycling will run on forever.
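For reference, the pure-Python equivalent of itertools.cycle shown in the documentation looks roughly like the following. It is reproduced from memory, so treat it as a sketch; it is named cycle_equivalent here only to avoid shadowing the real function.

```python
from itertools import islice

def cycle_equivalent(iterable):
    saved = []
    for element in iterable:
        yield element
        saved.append(element)   # remember items for later passes
    while saved:                # an empty input ends immediately
        for element in saved:
            yield element

print(list(islice(cycle_equivalent('ABC'), 7)))
# ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```

The saved list plays the same role as saved[i] in the multi-sequence version: items are passed through on the first pass and replayed afterwards.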
def izip_cycle_inplace(*iterables):
    def wrap(it):
        empty = True
        for x in it: empty = yield x
        if empty: return
        next(counter)
        while True:
            empty = True
            for x in it: empty = yield x
            if empty: raise ValueError('cannot cycle iterator in-place')

    iterators = [wrap(i) for i in iterables]
    counter = iter(iterators)
    next(counter)
    while True:
        yield [next(i) for i in iterators]

def izip_cycle(*iterables):
    def wrap(it):
        elements = []
        for x in it:
            yield x
            elements.append(x)
        if not elements: return
        next(counter)
        while True:
            for x in elements: yield x

    iterators = [wrap(i) for i in iterables]
    counter = iter(iterators)
    next(counter)
    while True:
        yield [next(i) for i in iterators]