Python unescape URL

I have a URL in this form: http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show. How can I turn it into a normal URL? I have tried urllib.unquote without much success.

I can always use regular expressions or some simple string replace stuff. But I believe that there is a better way to handle this...


urllib.unquote is for replacing %xx escape codes in URLs with the characters they represent. It won't be useful for this.

Your "simple string replace stuff" is probably the best solution.
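To make the point about percent-escapes concrete, here is a minimal sketch. It assumes Python 3, where the function lives at urllib.parse.unquote (in Python 2 it was urllib.unquote); the variable names are only for illustration.

```python
from urllib.parse import unquote  # Python 3 location of unquote

# The escaped URL from the question (raw string keeps the backslashes as-is)
escaped = r"http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"

# Percent-escapes such as %20 are decoded...
percent_decoded = unquote("The%20Truman%20Show")

# ...but backslash-escaped slashes contain no %xx codes, so nothing changes
backslash_result = unquote(escaped)

print(percent_decoded)              # The Truman Show
print(backslash_result == escaped)  # True
```

Since unquote only looks for %xx sequences, the backslashes pass through unchanged, which matches the lack of success described in the question.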


Have you tried using json.loads from the json module?

>>> import json
>>> json.loads('"http:\\/\\/en.wikipedia.org\\/wiki\\/The_Truman_Show"')
'http://en.wikipedia.org/wiki/The_Truman_Show'

The input that I'm showing isn't exactly what you have: I've wrapped it in double quotes to make it valid JSON.

When you first get it from the JSON, how are you decoding it? That's probably where the problem is.
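If the URL arrives inside a larger JSON document, decoding the whole document should already give you the plain URL. A rough sketch follows; the payload and the key name "url" are made up for illustration, since the question doesn't show where the string comes from.

```python
import json

# Hypothetical raw response body; the structure and the "url" key are assumptions
raw = r'{"url": "http:\/\/en.wikipedia.org\/wiki\/The_Truman_Show"}'

# json.loads treats \/ as an escaped forward slash and decodes it
data = json.loads(raw)

print(data["url"])  # http://en.wikipedia.org/wiki/The_Truman_Show
```

If you still see backslashes after a step like this, the string was probably pulled out of the raw text by hand instead of going through a JSON decoder.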


It seems like overkill to look for a library function when you can transform the URL yourself. Since there are no visible rules other than "/" being replaced by "\/", you can simply replace it back:

def unescape_this(url):
    # "\\/" is the two-character sequence: one backslash followed by a slash
    return url.replace("\\/", "/")