How to delete () using the re module in Python

I am having trouble processing XML text. I want to remove the outer parentheses from my text, as follows:

from <b>(apa-bhari(n))</b> to <b>apa-bhari(n)</b>

I wrote the following code:

name= re.sub('<b>\((.+)\)</b>','<b>\1</b>',name)

But this only returns

<b></b>

I do not understand escape sequences and backreferences. Please tell me the solution.


You need to use raw strings, or escape the backslashes:

name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)
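As a quick check of the fix above, here is a minimal sketch that runs both versions of the replacement on the sample text from the question. The variable names `broken` and `fixed` are only illustrative.

```python
import re

name = '<b>(apa-bhari(n))</b>'

# In a normal string literal, '\1' is the control character U+0001,
# so re.sub never sees a backreference.
broken = re.sub(r'<b>\((.+)\)</b>', '<b>\1</b>', name)

# With a raw string, the backslash reaches re.sub and \1 refers to group 1.
fixed = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)

print(repr(broken))  # '<b>\x01</b>'
print(fixed)         # <b>apa-bhari(n)</b>
```

The control character is invisible when printed, which is why the original output looked like an empty <b></b>.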


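In a normal Python string literal, a backslash followed by a digit is read as an octal character escape, so the backslash itself must be escaped. An unrecognized escape such as \) keeps its backslash. The following assertions all pass: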

assert '\1' == '\x01'
assert len('\\1') == 2
assert '\)' == '\\)'

So your code would be:

name = re.sub('<b>\\((.+)\\)</b>','<b>\\1</b>',name)

Alternatively, use raw string literals:

name = re.sub(r'<b>\((.+)\)</b>', r'<b>\1</b>', name)


Try:

name= re.sub('<b>\((.+)\)</b>','<b>\\1</b>',name)

or, if you do not want unreadable code with \\ everywhere you use a backslash, do not escape the backslashes by hand; put an r before the string instead. For example, r"\1" is the same as "\\1".
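A small sketch of that equivalence, with illustrative variable names. It also shows one caveat worth knowing: a raw literal cannot end in a single backslash, so something like r"myString\" is rejected by the parser.

```python
import re

escaped = '<b>\\((.+)\\)</b>'   # backslashes doubled by hand
raw = r'<b>\((.+)\)</b>'        # raw literal, backslashes kept as typed

# Both literals describe exactly the same string.
same_pattern = escaped == raw

# The source text r"myString\" is an unterminated string literal,
# so compiling it raises SyntaxError.
try:
    compile('r"myString\\"', '<example>', 'eval')
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False

result = re.sub(raw, r'<b>\1</b>', '<b>(apa-bhari(n))</b>')
print(same_pattern, trailing_backslash_ok, result)
```

If a string really has to end with a backslash, writing it as a normal literal with a doubled backslash is the usual way around this.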
