Elegant way to remove contiguous repeated elements in a list [closed]
I'm looking for a clean, Pythonic way to eliminate from the following list:
li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
all contiguous repeated elements (runs longer than one number) so as to obtain:
re = [0, 1, 2, 4, 3, 1]
Although I have working code, it feels un-Pythonic, and I am quite sure there must be a way (maybe some lesser-known itertools functions?) to achieve this far more concisely and elegantly.
Here is a version based on Karl's answer which doesn't require copies of the list (tmp, the slices, and the zipped list). izip is significantly faster than (Python 2) zip for large lists. chain is slightly slower than slicing but doesn't require a tmp object or copies of the list. islice plus making a tmp is a bit faster, but requires more memory and is less elegant.
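from itertools import izip, chain  # Python 2; on Python 3 use the built-in zip
# x, y, z are the previous, current, and next elements. The trailing
# (None,) pads the "next" stream so the last element is also tested.
[y for x, y, z in izip(chain((None, None), li),
                       chain((None,), li),
                       chain(li, (None,))) if x != y != z]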
A timeit test shows it to be approximately twice as fast as Karl's answer or my fastest groupby version for short groups.
Make sure to use a value other than None (like object()) if your list can contain Nones.
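On Python 3 there is no izip, because the built-in zip is already lazy. A rough sketch of the same idea under that assumption follows. The names singles_zip and pad are made up for illustration, a private object() sentinel stands in for None, and the "next" stream is padded at the end so that the final element is compared against a successor too. It still assumes li is a sequence that can be iterated three times, not a one-shot iterator.

```python
from itertools import chain


def singles_zip(li):
    """Return the elements of the sequence li that are not part of a run."""
    pad = object()  # one shared sentinel; it never equals a real element
    prev_items = chain((pad, pad), li)  # lags two positions behind
    cur_items = chain((pad,), li)       # lags one position behind
    next_items = chain(li, (pad,))      # padded so the last element is tested
    # The first triple is (pad, pad, li[0]); x != y is False there,
    # so the sentinel itself is never emitted.
    return [y for x, y, z in zip(prev_items, cur_items, next_items)
            if x != y != z]


li = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
print(singles_zip(li))  # [0, 1, 2, 4, 3, 1]
```

Because the sentinel is a fresh object(), lists containing None are handled like any other value.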
Use this version if you need it to work on an iterator / iterable that isn't a sequence, or your groups are long:
from itertools import groupby
# Replace None with an object() sentinel if li can contain None.
[key for key, group in groupby(li)
 if (next(group) or True) and next(group, None) is None]
timeit shows it's about ten times faster than the other version for 1,000-item groups.
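If the input can contain None, the same "peek at the second element" trick can be written with a private sentinel. This is only a sketch: singles_groupby and _missing are hypothetical names, and it is written as a generator so it also works on a one-shot iterator.

```python
from itertools import groupby

_missing = object()  # sentinel that cannot collide with list contents


def singles_groupby(iterable):
    """Yield each element whose run has length exactly one."""
    for key, group in groupby(iterable):
        next(group)  # every group has at least one element
        # Ask for a second element; getting the sentinel back means
        # the group is already exhausted, i.e. the run had length one.
        if next(group, _missing) is _missing:
            yield key


source = iter([0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0])
print(list(singles_groupby(source)))  # [0, 1, 2, 4, 3, 1]
```

Only the first two elements of each group are pulled through the group iterator; groupby skips the rest of the run itself when the loop moves on to the next key.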
Earlier, slow versions:
[key for key, group in groupby(li) if sum(1 for i in group) == 1]
[key for key, group in groupby(li) if len(tuple(group)) == 1]
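To check the timing claims on your own machine, something like the following could be used. It is a sketch, not the benchmark that was actually run: the function names, the compare helper, and the two sample inputs are all invented here, and no particular numbers are implied.

```python
from itertools import groupby
from timeit import timeit


def singles_sum(li):
    # Counts every element of every group.
    return [key for key, group in groupby(li)
            if sum(1 for _ in group) == 1]


def singles_peek(li):
    # Looks at no more than two elements of each group.
    missing = object()
    return [key for key, group in groupby(li)
            if (next(group) or True) and next(group, missing) is missing]


def compare(li, number=20):
    """Return the seconds each variant takes for `number` calls on li."""
    return {f.__name__: timeit(lambda: f(li), number=number)
            for f in (singles_sum, singles_peek)}


# Hypothetical inputs: many short runs versus a few 1,000-item runs.
short_runs = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0] * 1000
long_runs = [n for n in range(10) for _ in range(1000)]

if __name__ == "__main__":
    print("short runs:", compare(short_runs))
    print("long runs: ", compare(long_runs))
```

Both variants return the same list, so any difference in the printed times comes from how much of each group they consume.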
agf's answer is good if the size of the groups is small, but if there are enough duplicates in a row, it is more efficient not to "sum 1" over those groups. Because all() short-circuits, this version stops at the second element of each group:
[key for key, group in groupby(li) if all(i == 0 for i, j in enumerate(group))]
tmp = [object()] + li + [object()]
re = [y for x, y, z in zip(tmp[2:], tmp[1:-1], tmp[:-2]) if y != x and y != z]
The other solutions use various itertools helpers and comprehensions, and probably look more "Pythonic". However, a quick timing test I ran showed this generator was a bit faster:
_undef = object()
def itersingles(source):
    cur = _undef
    dup = True
    for elem in source:
        if dup:
            if elem != cur:
                cur = elem
                dup = False
        else:
            if elem == cur:
                dup = True
            else:
                yield cur
                cur = elem
    if not dup:
        yield cur
source = [0, 1, 2, 3, 3, 4, 3, 2, 2, 2, 1, 0, 0]
result = list(itersingles(source))