Python: cleaning up a string
I have a string like this:
somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need to .be removed '
Here is the result I want:
somestring='in this string i have many interesting occurrences of different chars that need to be removed'
I started chaining all kinds of .replace() calls by hand, but there are so many different combinations that I think there must be a simpler way. Perhaps there's a library that already does this?
Does anyone know how I can clean up this string?
I would use a regular expression to replace each run of non-alphanumeric characters with a single space:
>>> import re
>>> somestring='in this/ string / i have many. interesting.occurrences of {different chars} that need to .be removed '
>>> rx = re.compile(r'\W+')
>>> res = rx.sub(' ', somestring).strip()
>>> res
'in this string i have many interesting occurrences of different chars that need to be removed'
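One caveat with the pattern above: \W does not match the underscore, so a string like 'snake_case' would come through unchanged. If underscores should count as separators too, a rough sketch (the helper name clean_with_regex is just made up for illustration) could look like this:

```python
import re

# [\W_]+ matches runs of non-word characters AND underscores.
# The name clean_with_regex is hypothetical, not from any library.
_SEPARATORS = re.compile(r'[\W_]+')

def clean_with_regex(text):
    """Replace each run of separator characters with one space."""
    return _SEPARATORS.sub(' ', text).strip()

print(clean_with_regex('snake_case.name / {more}'))
```

For the string in the question this gives the same result as the plain \W+ version, since it contains no underscores.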
You have two steps: remove the punctuation, then remove the extra whitespace.
1) Use str.translate
import string
# Map every punctuation character to a space (Python 3; Python 2 used string.maketrans).
trans_table = str.maketrans(string.punctuation, " " * len(string.punctuation))
new_string = somestring.translate(trans_table)
This builds a translation table that maps each punctuation character to a space, then applies it.
2) Remove excess whitespace
new_string = " ".join(new_string.split())
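Putting the two steps together, a minimal sketch for Python 3 might look like the following. The function name clean_string is only an illustration, and it assumes "punctuation" means exactly the ASCII characters in string.punctuation:

```python
import string

# Translation table: each ASCII punctuation character maps to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def clean_string(text):
    """Step 1: punctuation -> spaces. Step 2: collapse whitespace runs."""
    no_punct = text.translate(_PUNCT_TO_SPACE)
    # split() with no arguments drops leading/trailing whitespace
    # and treats any run of whitespace as one separator.
    return " ".join(no_punct.split())

somestring = 'in this/ string / i have many. interesting.occurrences of {different chars} that need to .be removed '
print(clean_string(somestring))
```

Unlike the \W-based regex, this leaves non-ASCII punctuation (curly quotes, for example) untouched, so which one fits better depends on the input.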
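Or, to target only a specific set of characters, replace them with a space and then collapse the whitespace (replacing with an empty string would glue 'interesting.occurrences' together):
" ".join(re.sub(r'[\[\]/{}.,]+', ' ', somestring).split())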