
How to build a string from a regex and the value of a group

I have a regexp for a Twitter profile URL and someone's Twitter profile URL. I can easily extract the username from the URL.

>>> import re
>>> twitter_re = re.compile(r'twitter.com/(?P<username>\w+)/')
>>> twitter_url = 'twitter.com/dir01/'
>>> username = twitter_re.search(twitter_url).groups()[0]
>>> username
'dir01'

But if I have the regexp and the username, how do I get the URL?


Regexes are not a two-way street. You can use them to parse strings, but not to generate strings back from the parsed result. You should probably look into another way of building the URLs, such as basic string interpolation or URI templates (see http://code.google.com/p/uri-templates/).
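That said, for a pattern as simple as the one in the question, where everything outside the named groups is literal text, a rough workaround is to substitute each named group with its value. The sketch below is only an illustration under that assumption: fill_named_groups is a hypothetical helper, not part of the re module, and it does not handle nested groups, escaped characters such as \. or any other regex syntax outside the groups.

```python
import re

# Matches a simple, non-nested named group such as (?P<username>\w+).
_NAMED_GROUP = re.compile(r'\(\?P<(\w+)>[^()]*\)')

def fill_named_groups(pattern, values):
    """Replace each simple named group in `pattern` with values[name].

    Hypothetical helper: assumes everything outside the named groups
    is literal text that can be copied to the output unchanged.
    """
    return _NAMED_GROUP.sub(lambda m: values[m.group(1)], pattern)

url = fill_named_groups(r'twitter.com/(?P<username>\w+)/', {'username': 'dir01'})
print(url)  # twitter.com/dir01/
```

Anything more complicated than this quickly stops working, which is why a format string is the safer starting point.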


If you are not looking for a general solution to convert any regex into a formatting string, but something that you can hardcode:

twitter_url = 'twitter.com/%(username)s/' % {'username': 'dir01'}

...should give you what you need.

If you want a more general (but not incredibly robust) solution:

import re

def format_to_re(format):
    # Replace Python string formatting syntax with named group re syntax.
    return re.compile(re.sub(r'%\((\w+)\)s', r'(?P<\1>\w+)', format))

twitter_format = 'twitter.com/%(username)s/'
twitter_re = format_to_re(twitter_format)

m = twitter_re.search('twitter.com/dir01/')
print(m.groupdict())
print(twitter_format % m.groupdict())

Gives me:

{'username': 'dir01'}
twitter.com/dir01/
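One reason the version above is not very robust is that the literal parts of the format string go into the regex unescaped, so the "." in "twitter.com" matches any character. A possible variant, sketched here with the hypothetical name format_to_re_escaped, escapes the literal text with re.escape and only turns the %(name)s placeholders into named groups. It still assumes every placeholder should match \w+ and does not handle a literal %% in the format.

```python
import re

# Matches Python %-style named placeholders such as %(username)s.
_PLACEHOLDER = re.compile(r'%\((\w+)\)s')

def format_to_re_escaped(fmt):
    """Build a regex from a %(name)s format string, escaping literal text."""
    parts = []
    pos = 0
    for m in _PLACEHOLDER.finditer(fmt):
        # Literal text before the placeholder is escaped.
        parts.append(re.escape(fmt[pos:m.start()]))
        # The placeholder itself becomes a named group (assumed to be \w+).
        parts.append(r'(?P<%s>\w+)' % m.group(1))
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile(''.join(parts))

twitter_format = 'twitter.com/%(username)s/'
twitter_re = format_to_re_escaped(twitter_format)

match = twitter_re.search('twitter.com/dir01/')
rebuilt = twitter_format % match.groupdict()
print(rebuilt)  # twitter.com/dir01/
```

With the literal text escaped, a string such as 'twitterXcom/dir01/' no longer matches.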

And finally, the slightly larger and more complete solution that I have been using myself can be found in the Pattern class here.


Why do you need a regex for that? Just concatenate the strings.

base_url = "twitter.com/"
twt_handle = "dir01"
twit_url = base_url + twt_handle