Regex for removing whitespace
import re

def remove_whitespaces(value):
    "Remove all whitespaces"
    p = re.compile(r'\s+')
    return p.sub(' ', value)
The above code strips tags but doesn't remove "all" whitespaces from the value.
Thanks
The fastest general approach eschews REs in favor of string's fast, powerful .translate method:
import string
identity = string.maketrans('', '')
def remove_whitespace(value):
    return value.translate(identity, string.whitespace)
In 2.6, it's even simpler, just
    return value.translate(None, string.whitespace)
Note that this applies to "plain" Python 2.* strings, i.e., bytestrings -- Unicode strings' .translate method is somewhat different -- it takes a single argument, which must be a mapping of ord values for Unicode characters to Unicode strings, or None for deletion. I.e., taking advantage of dict's handy .fromkeys classmethod:
nospace = dict.fromkeys(ord(c) for c in string.whitespace)
def unicode_remove_whitespace(value):
    return value.translate(nospace)
to remove exactly the same set of characters. Of course, Unicode also has more characters you could consider whitespace and want to remove -- so you'd probably want to build a mapping unicode_nospace based on information from module unicodedata, rather than using this simpler approach based on module string.
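As a rough sketch of such a unicodedata-based mapping, assuming Python 3 (where every str is Unicode and .translate takes a single mapping): the rule used here for "whitespace" (general category Zs, or bidirectional class WS, B, or S) is the one documented for str.isspace, and it is only my assumption about what you would want to strip. The helper name build_unicode_nospace is made up for illustration.

```python
import sys
import unicodedata

def build_unicode_nospace():
    """Map every code point treated as whitespace to None (i.e. delete it)."""
    mapping = {}
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        # Assumed definition of whitespace: category Zs, or bidi class WS/B/S.
        if (unicodedata.category(ch) == 'Zs'
                or unicodedata.bidirectional(ch) in ('WS', 'B', 'S')):
            mapping[cp] = None
    return mapping

# Built once at import time; scanning every code point takes a moment.
unicode_nospace = build_unicode_nospace()

def unicode_remove_whitespace(value):
    return value.translate(unicode_nospace)
```

With this mapping, characters such as the no-break space (U+00A0) and the em space (U+2003) are removed along with the ASCII whitespace covered by string.whitespace.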
p.sub(' ', value)
should be
p.sub('', value)
The former replaces all whitespace with a single space, the latter replaces with nothing.
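A quick check of that difference, as a small sketch (the sample string is made up):

```python
import re

p = re.compile(r'\s+')
sample = 'a  b\t\nc'

collapsed = p.sub(' ', sample)  # each whitespace run becomes one space
removed = p.sub('', sample)     # each whitespace run is deleted

print(repr(collapsed))  # 'a b c'
print(repr(removed))    # 'abc'
```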
Maybe ''.join(value.split()) could work for you?
re.sub(r'\s*', '', value)
should also work!
re.sub(r'\s', '', value)
function works well for me, in this case.
@OP, compile your regex pattern outside the function, so you don't have to call re.compile every time you use the procedure. Also, you are substituting a single space back in, which is not removing spaces, is it?
p = re.compile(r'\s+')
def remove_whitespaces(p, value):
    "Remove all whitespaces"
    return p.sub('', value)
Lastly, another method that doesn't use regex is to split on whitespace and join the pieces back together:
def remove_whitespaces(value):
    "Remove all whitespaces"
    return ''.join(value.split())
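For what it's worth, here is a small sketch comparing this with the regex version on a made-up sample. str.split() with no arguments splits on runs of whitespace and drops leading and trailing whitespace, so only the non-whitespace pieces are left to join. The two function names are hypothetical, chosen just to tell the variants apart.

```python
import re

def remove_whitespaces_split(value):
    # No-argument split() breaks on any run of whitespace.
    return ''.join(value.split())

def remove_whitespaces_re(value):
    # Regex equivalent: delete every run of whitespace.
    return re.sub(r'\s+', '', value)

sample = '  hello \t world\n'
print(remove_whitespaces_split(sample))  # helloworld
print(remove_whitespaces_re(sample))     # helloworld
```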