Python: keep a character only if it is in this list

I have a list:

a = ['a','b','c'.........'A','B','C'.........'Z']

and I have a string:

string1= 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'

I want to keep ONLY those characters in string1 that exist in a.

What is the most efficient way to do this? Perhaps instead of having a be a list, I should just make it a string, like this: a='abcdefg..........ABC..Z'?


This should be faster.

>>> import re
>>> string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
>>> a = ['E', 'i', 'W']
>>> r = re.compile('[^%s]+' % re.escape(''.join(a)))  # escape so chars like ']' or '^' in a stay literal
>>> print r.sub('', string1)
EiiWW

This is even faster than that.

>>> all_else = ''.join( chr(i) for i in range(256) if chr(i) not in set(a) )
>>> string1.translate(None, all_else)
'EiiWW'

44 microseconds for the regex vs 13 microseconds for translate on my laptop.

How about that?

(Edit: as it turned out, translate yields the best performance.)
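
For anyone who wants to reproduce the comparison, here is a rough sketch using timeit. It is written for Python 3, where str.translate takes a mapping of code points instead of the (None, deletechars) form used above; the function names with_regex and with_translate are made up for this example, and the table only covers code points 0 to 255, mirroring the range(256) loop above. The numbers will differ from machine to machine.

```python
import re
import timeit

string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'
a = ['E', 'i', 'W']
keep = set(a)

# Regex approach: delete every run of characters that is not in a.
pattern = re.compile('[^%s]+' % re.escape(''.join(a)))

def with_regex():
    return pattern.sub('', string1)

# translate approach, Python 3 form: map each unwanted code point to None.
# Assumption: only code points 0..255 are covered, as in the answer above.
table = {i: None for i in range(256) if chr(i) not in keep}

def with_translate():
    return string1.translate(table)

# Both approaches should agree before timing them.
assert with_regex() == with_translate()

runs = 10000
for func in (with_regex, with_translate):
    seconds = timeit.timeit(func, number=runs)
    print('%s: %.2f microseconds per call' % (func.__name__, seconds / runs * 1e6))
```

Both the compiled pattern and the table are built once outside the timed functions, so the timing covers only the filtering itself.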


''.join([s for s in string1 if s in a])

Explanation:

[s for s in string1 if s in a]

creates a list of all characters in string1, but only if they are also in the list a.

''.join([...])

turns it back into a string by joining it with nothing ('') in between the elements of the given list.


List comprehension to the rescue!

wanted = ''.join(letter for letter in string1 if letter in a)

(Note that when passing a list comprehension to a function you can omit the brackets, so that the full list isn't built before it is passed in. The result is the same as with the list comprehension, but this form is called a generator expression.)
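
On the list-versus-string question from the original post: one option (not taken from the answers here, just a sketch) is to turn the allowed characters into a set, since `in` on a set does not have to scan every element the way it does on a list. The name allowed is made up for this example, and string.ascii_letters stands in for the elided list a, assuming it was meant to hold exactly the ASCII letters.

```python
import string

# Assumption: the list a holds the ASCII letters a-z and A-Z.
allowed = frozenset(string.ascii_letters)

string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'

# Same generator expression as above, but membership is tested against a set.
wanted = ''.join(letter for letter in string1 if letter in allowed)
print(wanted)  # sERGdfhliisgdfjkskjdfWJWLIr
```

For a list of only 52 letters the difference is likely small; it should matter more as the collection of allowed characters grows.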


If you are going to do this with large strings, there is a faster solution using translate; see this answer.


@katrielalex: To spell it out:

import string 
string1= 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'

non_letters= ''.join(chr(i) for i in range(256) if chr(i) not in string.letters)
print string1.translate(None,non_letters)

print 'Simpler, but possibly less correct'
print string1.translate(None, string.punctuation+string.digits+string.whitespace)
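
The code above is Python 2. As a rough Python 3 counterpart of the "simpler" variant, the three-argument form of str.maketrans builds a table that deletes the listed characters. The names unwanted, delete_table and letters_only are made up for this sketch, and the same caveat applies: it only removes ASCII punctuation, digits and whitespace, not every possible non-letter.

```python
import string

string1 = 's#$%ERGdfhliisgdfjkskjdfW$JWLI3590823r'

# Three-argument str.maketrans: the third argument lists characters to delete.
unwanted = string.punctuation + string.digits + string.whitespace
delete_table = str.maketrans('', '', unwanted)

letters_only = string1.translate(delete_table)
print(letters_only)  # sERGdfhliisgdfjkskjdfWJWLIr
```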