Python regex match and replace
I need to find, process and remove (one by one) any substrings that match a rather long regex:
# p is a compiled regex
# s is a string
while 1:
    m = p.match(s)
    if m is None:
        break
    process(m.group(0))  # do something with the matched pattern
    s = re.sub(m.group(0), '', s)  # remove it from string s
The code above is not good for 2 reasons:
It doesn't work if m.group(0) happens to contain any regex-special characters (like *, +, etc.).
It feels like I'm duplicating the work: first I search the string for the regular expression, and then I have to kinda go look for it again to remove it.
What's a good way to do this?
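The re.sub function can take a function as its replacement argument: it is called once for each match with the match object, and the string it returns replaces that match. So you can combine the replacement and processing steps if you wish: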
# p is a compiled regex
# s is a string
def process_match(m):
    # Process the match here.
    return ''

s = p.sub(process_match, s)
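To see the callback approach end to end, here is a minimal runnable sketch. The digit pattern, the sample string, and the `seen` list are made up for illustration; recording the match text stands in for whatever your real process() does.

```python
import re

# Hypothetical pattern for illustration only: runs of digits.
p = re.compile(r"\d+")

seen = []  # every matched substring, in the order it was found


def process_match(m):
    # Stand-in for the real processing step: just record the matched text.
    seen.append(m.group(0))
    # Whatever is returned replaces the match; '' removes it from the string.
    return ''


s = "abc123def45+6*"
s = p.sub(process_match, s)

print(s)     # abcdef+*
print(seen)  # ['123', '45', '6']
```

Note that p.sub handles every non-overlapping match anywhere in the string in a single pass, whereas the loop in the question used p.match, which only looks at the start of the string. If you ever do need to build a pattern from matched text, re.escape(m.group(0)) takes care of the special characters.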