List Comprehension for removing duplicates of characters in a string [duplicate]
Possible Duplicate:
How do you remove duplicates from a list whilst preserving order?
So the idea is the program takes a string of characters and removes the same string with any duplicated character only appearing once -- removing any duplicated copy of a character. So Iowa stays Iowa but the word eventually would become eventually
Here is an inefficient method:
x = 'eventually'
newx = ''.join([c for i,c in enumerate(x) if c not in x[:i]])
I don't think that there is an efficient way to do it in a list comprehension.
Here it is as an O(n) (average case) generator expression. The others are all roughly O(n2).
chars = set()
string = "aaaaa"
newstring = ''.join(chars.add(char) or char for char in string if char not in chars)
It works because set.add
returns None
, so the or
will always cause the character to be yielded from the generator expression when the character isn't already in the set
.
Edit: Also see refaim's solutions. My solution is like his second one, but it uses the set
in the opposite way.
My take on his OrderedDict
solution:
''.join(OrderedDict((char, None) for char in word))
Without list comprehensions:
from collections import OrderedDict
word = 'eventually'
print ''.join(OrderedDict(zip(word, range(len(word)))).keys())
With list comprehensions (quick and dirty solution):
word = 'eventually'
uniq = set(word)
print ''.join(c for c in word if c in uniq and not uniq.discard(c))
>>> s='eventually'
>>> "".join([c for i,c in enumerate(s) if i==s.find(c)])
'evntualy'
note that using a list comprehension with join()
is silly when you can just use a generator expression. You should tell your teacher to update their question
You could make a set
from the string, then join it together again. This works since sets can only contain unique values. The order wont be the same though:
In [1]: myString = "mississippi"
In [2]: set(myString))
Out[2]: set(['i', 'm', 'p', 's'])
In [3]: print "".join(set(myString))
Out[3]: ipsm
In [4]: set("iowa")
Out[4]: set(['a', 'i', 'o', 'w'])
In [5]: set("eventually")
Out[5]: set(['a', 'e', 'l', 'n', 't', 'u', 'v', 'y'])
Edit: Just saw the "List Comprehension" in the title so this probably isnt what your looking for.
Create a set from the original string, and then sort by position of character in original string:
>>> s='eventually'
>>> ''.join(sorted(set(s), key=s.index))
'evntualy'
Taken from this question, I think this is the fastest way:
>>> def remove_dupes(str):
... chars = set()
... chars_add = chars.add
... return ''.join(c for c in str if c not in chars and not chars_add(c))
...
>>> remove_dupes('hello')
'helo'
>>> remove_dupes('testing')
'tesing'
word = "eventually"
evntualy = ''.join(
c
for d in [dict(zip(word, word))]
for c in word
if d.pop(c, None) is not None)
Riffing off of agf's (clever) solution but without making a set outside of the generator expression:
evntualy = ''.join(s.add(c) or c for s in [set()] for c in word if c not in s)
精彩评论