How to replace pairs of tokens in a string?
New to python, competent in a few languages, but can't see a 'snazzy' way of doing the following. I'm sure it's screaming out for a regex, but any solution I can come up with (using regex groups and what not) becomes insane quite quickly.
So, I have a string with html-like tags that I want to replace with actual html tags.
For example:
Hello, my name is /bJane/b.
Should become:
Hello, my name is <b>Jane</b>.
It might be combo'd with [i]talic and [u]nderline as well:
/iHello/i, my /uname/u is /b/i/uJane/b/i/u.
Should become:
<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</b></i></u>.
Obviously a straight str.replace won't work, because every second occurrence of a token has to become the closing tag, i.e. the tag name preceded by a forward slash.
For clarity, if tokens are being combo'd, it's always first opened, first closed.
Many thanks!
PS: Before anybody gets excited, I know that this sort of thing should be done with CSS, blah, blah, blah, but I didn't write the software, I'm just reversing its output!
Maybe something like this can help:
import re


def text2html(text):
    """Convert a text in a certain format to html.

    Examples:
    >>> text2html('Hello, my name is /bJane/b')
    'Hello, my name is <b>Jane</b>'
    >>> text2html('/iHello/i, my /uname/u is /b/i/uJane/u/i/b')
    '<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</u></i></b>'
    """
    # Tokens (e.g. '/b') that have been opened but not yet closed.
    elem = []

    def to_tag(match_obj):
        match = match_obj.group(0)
        if match in elem:
            # Second occurrence of the token: emit the closing tag.
            elem.pop(elem.index(match))
            return "</{0}>".format(match[1])
        else:
            # First occurrence of the token: emit the opening tag.
            elem.append(match)
            return "<{0}>".format(match[1])

    # '/.' matches a slash followed by any single character.
    return re.sub(r'/.', to_tag, text)


if __name__ == "__main__":
    import doctest
    doctest.testmod()
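One thing to watch with the pattern '/.' is that it treats a slash followed by any character as a token, so text such as "and/or" would also be rewritten. Here is a minimal sketch of a stricter variant, assuming the only tokens are /b, /i and /u as in the question. The name text2html_strict is made up for illustration.

```python
import re


def text2html_strict(text):
    """Toggle /b, /i and /u between opening and closing tags."""
    # Letters whose tag is currently open (hypothetical helper state).
    open_tags = set()

    def to_tag(match):
        letter = match.group(1)
        if letter in open_tags:
            # Tag is already open, so this occurrence closes it.
            open_tags.remove(letter)
            return "</%s>" % letter
        # Tag is not open yet, so this occurrence opens it.
        open_tags.add(letter)
        return "<%s>" % letter

    # Only a slash followed by b, i or u counts as a token.
    return re.sub(r"/([biu])", to_tag, text)


print(text2html_strict("/iHello/i, my /uname/u is /b/i/uJane/b/i/u."))
```

Because each letter is toggled independently, this follows whatever closing order the input uses, including the "first opened, first closed" order from the question. A literal "/b", "/i" or "/u" inside ordinary text (say, "and/but") would still be taken for a token.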
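With sed, using extended regular expressions (-E); note that this only handles simple, non-nested pairs:
sed -E 's/\/([biu])([^/]+)\/\1/<\1>\2<\/\1>/g'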
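Since the question is about Python, the same backreference idea as the sed expression above can be sketched with re.sub. The names PAIR and replace_simple_pairs are made up for illustration, and like the sed version this only covers a token pair wrapped around slash-free text; combined tokens are left alone.

```python
import re

# /x ... /x where x is b, i or u and the enclosed text contains no slash.
# \1 in the pattern requires the closing token to use the same letter.
PAIR = re.compile(r"/([biu])([^/]+)/\1")


def replace_simple_pairs(text):
    """Rewrite simple, non-nested /x.../x pairs as <x>...</x>."""
    return PAIR.sub(r"<\1>\2</\1>", text)


print(replace_simple_pairs("/iHello/i, my /uname/u is /bJane/b."))
```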
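A very simple solution would be to split the string on the source tag '/b' and rejoin the resulting list of substrings with the destination tag '<b>', like this:
s = "Hello, my name is /bJane/b."
s = '<b>'.join(s.split('/b'))
print(s)
Hello, my name is <b>Jane<b>.
Note that this inserts the opening tag at both positions, so the second occurrence still has to be turned into the closing tag '</b>'.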
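The split idea can be taken one step further so that every second occurrence becomes the closing tag. This is only a sketch, assuming each token always appears an even number of times. The helper names replace_pairs and convert are made up for illustration.

```python
def replace_pairs(text, token, tag):
    """Replace alternating occurrences of token with <tag> and </tag>."""
    parts = text.split(token)
    out = [parts[0]]
    for i, part in enumerate(parts[1:]):
        # Occurrences 0, 2, 4, ... open the tag; 1, 3, 5, ... close it.
        out.append(("<%s>" if i % 2 == 0 else "</%s>") % tag)
        out.append(part)
    return "".join(out)


def convert(text):
    """Apply replace_pairs for each of the tokens /b, /i and /u in turn."""
    for letter in "biu":
        text = replace_pairs(text, "/" + letter, letter)
    return text


print(convert("/iHello/i, my /uname/u is /b/i/uJane/b/i/u."))
```

Each token is handled in its own pass, so a closing tag produced earlier (for example '</b>') is never mistaken for a later token, because the later passes only look for '/i' and '/u'.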