
Why can't urllib2.urlopen open pages like "http://localhost/new-post#comment-29"?

I'm curious: why do I get a 404 error when running this line:

urllib2.urlopen("http://localhost/new-post#comment-29")

Yet everything works fine when I open http://localhost/new-post#comment-29 in any browser.

Does the urlopen method not parse URLs with "#" in them?

Does anybody know?


In the HTTP protocol, the fragment (everything from # onwards) is not sent to the server across the network. The browser retains it locally and, once the server's response is fully received, uses it to locate the exact spot in the page to be shown as "current" (for example, if the returned page is HTML, this is done by parsing the HTML and looking for the first suitable <a> tag).
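To make this concrete, here is a small sketch of what a client would normally put on the wire for such a URL. It uses Python 3's urllib.parse (the urllib2 in the question is Python 2), and request_line is a hypothetical helper written for illustration, not part of any library:

```python
from urllib.parse import urlsplit


def request_line(url, method="GET"):
    """Build the HTTP/1.1 request line a client would send for `url`.

    Hypothetical helper for illustration: only the path and the query
    string go into the request target; the fragment is left out.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"


print(request_line("http://localhost/new-post#comment-29"))
```

A client that instead passes "/new-post#comment-29" through as the path is asking the server for a resource that most likely does not exist, which would be consistent with the 404 in the question.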

So, the procedure is: remove the fragment, e.g. via urlparse.urlparse; use the rest to fetch the resource; parse it appropriately based on the content-type header of the server's response; then take whatever visual action your program performs regarding the "current spot" on the resource, by locating within the parsed resource the fragment you retained in the first step.
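The steps above could be sketched roughly as follows. This is a minimal illustration in Python 3, not a definitive implementation: split_fragment, AnchorFinder and locate_fragment are hypothetical names, and the sketch assumes the resource is HTML and that the "spot" is an element whose id matches the fragment, or an <a> whose name matches it. In Python 2, urlparse.urldefrag and urllib2.urlopen play the roles of urllib.parse.urldefrag and urllib.request.urlopen.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


def split_fragment(url):
    """Return (url_without_fragment, fragment) for the given URL."""
    result = urldefrag(url)
    return result.url, result.fragment


class AnchorFinder(HTMLParser):
    """Remember where the first element matching the fragment starts."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.found_at = None  # (line, column) of the first match, if any

    def handle_starttag(self, tag, attrs):
        if self.found_at is not None:
            return  # keep only the first match
        attrs = dict(attrs)
        if attrs.get("id") == self.fragment or (
            tag == "a" and attrs.get("name") == self.fragment
        ):
            self.found_at = self.getpos()


def locate_fragment(html, fragment):
    """Return the (line, column) of the anchor in `html`, or None."""
    finder = AnchorFinder(fragment)
    finder.feed(html)
    return finder.found_at


base_url, fragment = split_fragment("http://localhost/new-post#comment-29")
print(base_url)   # the part to request from the server
print(fragment)   # the part to keep locally

# Fetching is left commented out because it needs a running server:
# from urllib.request import urlopen
# html = urlopen(base_url).read().decode("utf-8", errors="replace")
# print(locate_fragment(html, fragment))
```

A real program would also check the response's Content-Type before assuming HTML, as the answer notes.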
