How to replace pairs of tokens in a string?

2023-02-17 02:18 Q&A author:

New to Python, competent in a few languages, but can't see a 'snazzy' way of doing the following. I'm sure it's screaming out for a regex, but any solution I can come up with (using regex groups and whatnot) becomes insane quite quickly.

So, I have a string with html-like tags that I want to replace with actual html tags.

For example:

Hello, my name is /bJane/b.

Should become:

Hello, my name is <b>Jane</b>.

It might be combo'd with [i]talic and [u]nderline as well:

/iHello/i, my /uname/u is /b/i/uJane/b/i/u.

Should become:

<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</b></i></u>.

Obviously a straight str.replace won't work because every 2nd token needs to be preceded by a forward slash.

For clarity, if tokens are being combo'd, it's always first opened, first closed.

Many thanks!

PS: Before anybody gets excited, I know that this sort of thing should be done with CSS, blah, blah, blah, but I didn't write the software, I'm just reversing its output!

Maybe something like this can help:

import re


def text2html(text):
    """ Convert a text in a certain format to html.

    Examples:
    >>> text2html('Hello, my name is /bJane/b')
    'Hello, my name is <b>Jane</b>'
    >>> text2html('/iHello/i, my /uname/u is /b/i/uJane/u/i/b')
    '<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</u></i></b>'

    """

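    elem = []  # tokens (e.g. '/b') that have been opened but not yet closed

    def to_tag(match_obj):
        match = match_obj.group(0)
        if match in elem:
            # Second occurrence of the token: emit the closing tag.
            elem.remove(match)
            return "</{0}>".format(match[1])
        else:
            # First occurrence of the token: emit the opening tag.
            elem.append(match)
            return "<{0}>".format(match[1])

    # '/.' matches a slash followed by any single character.
    return re.sub(r'/.', to_tag, text)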

if __name__ == "__main__":
    import doctest
    doctest.testmod()
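
As a variation on the function above, here is a minimal sketch that only treats /b, /i and /u as tokens, so other slashes in the text (for example "and/or") are left untouched. The function name toggle_tags and the assumption that b, i and u are the only tags are mine, not from the question.

```python
import re


def toggle_tags(text):
    """Turn /b, /i, /u tokens into alternating opening and closing HTML tags."""
    open_tags = set()  # tag letters that are currently open

    def repl(match_obj):
        tag = match_obj.group(1)
        if tag in open_tags:
            # Tag is already open, so this occurrence closes it.
            open_tags.discard(tag)
            return "</{0}>".format(tag)
        # Tag is not open, so this occurrence opens it.
        open_tags.add(tag)
        return "<{0}>".format(tag)

    # Only a slash followed by b, i or u counts as a token (assumption).
    return re.sub(r"/([biu])", repl, text)


print(toggle_tags("/iHello/i, my /uname/u is /b/i/uJane/b/i/u."))
```

Because it tracks open tags in a set rather than a stack, it reproduces the "first opened, first closed" ordering from the question as-is. A slash that happens to be followed by b, i or u in ordinary text (say "a/b") would still be converted.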

With sed -E (extended regular expressions; this handles non-nested pairs only):

s/\/([biu])([^/]+)\/\1/<\1>\2<\/\1>/g
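
For comparison, the same backreference idea as the sed expression can be sketched in Python with re.sub. The function name replace_simple_pairs is made up for illustration. Like the sed version, it only matches a pair whose content contains no slash, so the nested /b/i/uJane/b/i/u case is left unchanged.

```python
import re


def replace_simple_pairs(text):
    """Replace /xCONTENT/x with <x>CONTENT</x> for x in b, i, u (non-nested only)."""
    # Group 1 captures the tag letter, group 2 the content up to the next slash;
    # \1 requires the closing token to use the same letter as the opening one.
    return re.sub(r"/([biu])([^/]+)/\1", r"<\1>\2</\1>", text)


print(replace_simple_pairs("Hello, my name is /bJane/b."))
print(replace_simple_pairs("/iHello/i, my /uname/u is here."))
```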

A very simple solution would be to split the string on the source tag '/b' and rejoin the array of substrings with the new destination tag '<b>', like this:

s = "Hello, my name is /bJane/b."
s = '<b>'.join(s.split('/b'))
print(s)

Hello, my name is <b>Jane<b>.

Note that both occurrences become '<b>'; the second one still has to be turned into the closing '</b>'.
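
The split-and-join idea above can be extended so that the separators alternate between opening and closing tags. This is only a sketch: the helper names replace_pair and convert are hypothetical, and it assumes every token appears an even number of times and that b, i and u are the only tags.

```python
def replace_pair(s, tag):
    """Replace alternating '/tag' tokens with '<tag>' and '</tag>'."""
    parts = s.split("/" + tag)
    out = [parts[0]]
    for i, part in enumerate(parts[1:], start=1):
        # Odd-numbered separators open the tag, even-numbered ones close it.
        out.append("<{0}>".format(tag) if i % 2 else "</{0}>".format(tag))
        out.append(part)
    return "".join(out)


def convert(s, tags="biu"):
    """Apply replace_pair once per tag letter (assumed tag set: b, i, u)."""
    for tag in tags:
        s = replace_pair(s, tag)
    return s


print(convert("Hello, my name is /bJane/b."))
print(convert("/iHello/i, my /uname/u is /b/i/uJane/b/i/u."))
```

Handling one tag letter per pass keeps each step as simple as the original one-liner, at the cost of scanning the string once per tag.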
