How to delete () using the re module in Python
I am having trouble processing XML text. I want to delete the outer parentheses from my text, as follows:
from <b>(apa-bhari(n))</b>
to <b>apa-bhari(n)</b>
I wrote the following code:
name= re.sub('<b>\((.+)\)</b>','<b>\1</b>',name)
But this only returns
<b></b>
I do not understand escape sequences and backreferences. Please tell me the solution.
You need to use raw strings, or escape the backslashes:
name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)
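To see why the original call loses the text, here is a minimal sketch using the sample input from the question. The variable names `text`, `pattern`, `broken` and `fixed` are only for illustration. In a normal string literal, `'\1'` is the single control character `\x01`, so `re.sub` never receives a backreference.

```python
import re

text = '<b>(apa-bhari(n))</b>'

# Raw pattern: \( and \) reach the regex engine as escaped parentheses.
pattern = r'<b>\((.+)\)</b>'

# Non-raw replacement: '\1' is an octal escape, i.e. the character \x01,
# so the result contains an invisible control character instead of group 1.
broken = re.sub(pattern, '<b>\1</b>', text)

# Raw replacement: r'\1' is a backslash followed by '1', which re.sub
# interprets as "insert capture group 1".
fixed = re.sub(pattern, r'<b>\1</b>', text)

print(repr(broken))
print(repr(fixed))
```

The invisible `\x01` character is why the output in the question looks like an empty `<b></b>`.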
In a normal Python string literal, a backslash followed by a digit is read as an octal escape, so the backslash itself must be escaped. The following assertions all pass:
assert '\1' == '\x01'
assert len('\\1') == 2
assert '\)' == '\\)'
So, your code would be
name = re.sub('<b>\\((.+)\\)</b>','<b>\\1</b>',name)
Alternatively, use raw string literals (the r prefix):
name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>',name)
Try:
name= re.sub('<b>\((.+)\)</b>','<b>\\1</b>',name)
or, if you do not want unreadable code with \\ everywhere you use a backslash, do not escape the backslashes manually; add an r before the string instead. For example, r"\1" is the same as "\\1". (A raw string cannot end in a single backslash, so r"myString\" is a syntax error.)
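As a quick check of the equivalence between the two spellings, the sketch below compares them on the question's input. The helper `is_valid_literal` is a hypothetical name introduced here; it only tests whether a piece of source text compiles as an expression. It also shows one limit of raw strings: a literal cannot end in a single backslash.

```python
import re

text = '<b>(apa-bhari(n))</b>'
pattern = r'<b>\((.+)\)</b>'

# Both replacement spellings are the same string: backslash followed by '1'.
same_string = (r'<b>\1</b>' == '<b>\\1</b>')

escaped_result = re.sub(pattern, '<b>\\1</b>', text)
raw_result = re.sub(pattern, r'<b>\1</b>', text)


def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False


# The source text r"myString\" : the backslash keeps the closing quote
# from ending the literal, so the string is unterminated.
trailing_raw_ok = is_valid_literal('r"myString\\"')

# The source text "myString\\" : a normal string ending in one backslash.
trailing_escaped_ok = is_valid_literal('"myString\\\\"')

print(same_string, escaped_result == raw_result)
print(trailing_raw_ok, trailing_escaped_ok)
```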