
Elegant way to remove contiguous repeated elements in a list [closed]

Closed. This question is opinion-based. It is not currently accepting answers.


Closed 1 year ago.


I'm looking for a clean, Pythonic way to eliminate from the following list:

li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

all contiguous repeated elements (any element that appears in a run longer than one), so as to obtain:

re = [0, 1, 2, 4, 3, 1]

Although I have working code, it feels un-Pythonic, and I am quite sure there must be a way out there (maybe some lesser-known itertools functions?) to achieve what I want far more concisely and elegantly.


Here is a version based on Karl's answer which doesn't require copies of the list (tmp, the slices, and the zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is slightly slower than slicing but doesn't require a tmp object or copies of the list. islice plus making a tmp is a bit faster, but requires more memory and is less elegant.

from itertools import izip, chain
# pad the front of x and y and the tail of z so that every element of li
# is compared, as y, against both of its neighbours (including the last one)
[y for x, y, z in izip(chain((None, None), li),
                       chain((None,), li),
                       chain(li, (None,))) if x != y != z]

A timeit test shows it to be approximately twice as fast as Karl's answer or my fastest groupby version for short groups.

Make sure to use a value other than None (like object()) if your list can contain None.
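On Python 3, itertools.izip is gone (the built-in zip is already lazy), so a sketch of the same neighbour-comparison idea, using an object() sentinel so it also works for lists containing None, might look like:

```python
from itertools import chain

li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
pad = object()  # sentinel that compares unequal to every list element

# x is the element before y, z the element after it; keep y only when
# it differs from both neighbours
result = [y for x, y, z in zip(chain((pad, pad), li),
                               chain((pad,), li),
                               chain(li, (pad,)))
          if x != y != z]
print(result)  # → [0, 1, 2, 4, 3, 1]
```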

Use this version if you need it to work on an iterator / iterable that isn't a sequence, or your groups are long:

from itertools import groupby
[key for key, group in groupby(li)
        if (next(group) or True) and next(group, None) is None]

timeit shows it's about ten times faster than the other version for 1,000 item groups.
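Applied to the sample list from the question, the trick works like this: next(group) pulls the first item of each group, the `or True` keeps the condition truthy even when that item is falsy (like 0), and the second next(group, None) returns None only when the group had exactly one item — so each group is touched at most twice, no matter how long it is.

```python
from itertools import groupby

li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

# keep a group's key only when pulling a second item from the group fails
result = [key for key, group in groupby(li)
          if (next(group) or True) and next(group, None) is None]
print(result)  # → [0, 1, 2, 4, 3, 1]
```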

Earlier, slow versions:

[key for key, group in groupby(li) if sum(1 for i in group) == 1]
[key for key, group in groupby(li) if len(tuple(group)) == 1]


agf's answer is good if the groups are small, but if there are enough duplicates in a row, it is more efficient not to "sum 1" over those groups:

from itertools import groupby
[key for key, group in groupby(li) if all(i == 0 for i, j in enumerate(group))]


tmp = [object()] + li + [object()]
re = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2]) if y != x and y != z]
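Run against the sample list from the question, this produces the expected result:

```python
li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]

# pad both ends with unique sentinels so the boundary elements are
# compared against something that never matches them
tmp = [object()] + li + [object()]
result = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2])
          if y != x and y != z]
print(result)  # → [0, 1, 2, 4, 3, 1]
```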


The other solutions use various itertools helpers and comprehensions, and probably look more "Pythonic". However, a quick timing test I ran showed this generator was a bit faster:

_undef = object()

def itersingles(source):
    # Track the current element and whether it has repeated; an element
    # is only yielded once its successor is known to differ from it.
    cur = _undef
    dup = True
    for elem in source:
        if dup:
            if elem != cur:
                cur = elem
                dup = False
        else:
            if elem == cur:
                dup = True
            else:
                yield cur
                cur = elem
    if not dup:
        yield cur

source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
result = list(itersingles(source))
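Unlike the slice-based versions, this generator also works on a one-shot iterator that isn't a sequence; a quick check (repeating the generator definition so the snippet is self-contained):

```python
def itersingles(source):
    # Yield elements of `source` that do not repeat contiguously.
    _undef = object()
    cur = _undef
    dup = True
    for elem in source:
        if dup:
            if elem != cur:
                cur = elem
                dup = False
        else:
            if elem == cur:
                dup = True
            else:
                yield cur
                cur = elem
    if not dup:
        yield cur

# works on a generator that can only be consumed once
stream = (n for n in [0, 1, 1, 2])
print(list(itersingles(stream)))  # → [0, 2]
```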
