Python urllib2 question

I am trying to print some info from a URL, but I want to skip the print if a certain text is found. I have:

import urllib2

url_number = 1
url_number_str = url_number
a = 1

while a != 10:
    f = urllib2.urlopen('http://example.com/?=' + str(url_number_str))
    f_contents = f.read()
    if f_contents != '{"Response":"Parse Error"}':
        print f_contents
        a += 1
        url_number_str += 1

So {"Response":"Parse Error"} is the text that I want to detect. When it is found, I want to skip printing f.read() and load the NEXT URL (number 2).


Although your question is a bit unclear, try this:

f = urllib2.urlopen('http://example.com/?id=1000')
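for line in f.readlines():
    # readlines() keeps the trailing newline, so strip before comparing
    if line.strip() != '{"Response":"Parse Error"}':
        print line

This loops over every line in the webpage and prints each one, skipping any line that is '{"Response":"Parse Error"}'.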

Edit: Never mind, this is probably what you want:

f = urllib2.urlopen('http://example.com/?id=1000')
data = f.read()
if data != '{"Response":"Parse Error"}':
    print data

This will print the entire webpage, unless it is '{"Response":"Parse Error"}'.


read() reads a block of data, and that block is most likely longer than the string '{"Response":"Parse Error"}' itself.

So you should search for the string within the data you read (see @harpyon's answer), using a regular expression or a strstr-like substring search.
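
As a minimal sketch of that substring search: Python's `in` operator does a strstr-like search, and `re.search` is the regular-expression equivalent. The helper names `is_parse_error` and `is_parse_error_re` and the two sample bodies are made up for illustration; only the marker string comes from the question.

```python
import re

# Marker text taken from the question.
PARSE_ERROR = '{"Response":"Parse Error"}'

def is_parse_error(body):
    """Return True if the marker appears anywhere in the response body."""
    return PARSE_ERROR in body

def is_parse_error_re(body):
    """Same check with a regular expression; re.escape makes the marker literal."""
    return re.search(re.escape(PARSE_ERROR), body) is not None

# Hypothetical response bodies, used here instead of a real urlopen() call.
bad_body = '{"Response":"Parse Error"}\n'
good_body = '{"Response":"OK","Id":2}\n'
```

Because bad_body ends with a newline, an exact != comparison against the marker would not recognise it as an error, while the substring check still does.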


I think this is what you want:

a = 1

while a != 100:
    f = urllib2.urlopen('http://example.com/?id=1000')
    f_contents = f.read()
    if f_contents != '{"Response":"Parse Error"}':
        print f_contents
    a += 1

If you don't want to fetch the same page over and over, though, you probably forgot to include a in the URL.
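
A sketch of that loop with the counter included in the URL, so each iteration requests a different page and the counter advances whether or not the page is printed. To keep it runnable without network access, the page fetcher is passed in as a function. The names `print_pages`, `fetch`, and `fake_fetch` are hypothetical, and the `?id=` parameter is copied from the snippets above.

```python
PARSE_ERROR = '{"Response":"Parse Error"}'

def print_pages(fetch, first_id, last_id):
    """Fetch ids first_id..last_id and print every body that is not the parse error.

    Returns the list of printed bodies.
    """
    printed = []
    for page_id in range(first_id, last_id + 1):
        url = 'http://example.com/?id=' + str(page_id)
        body = fetch(url)
        if PARSE_ERROR in body:
            continue  # skip this page and move on to the next id
        print(body)
        printed.append(body)
    return printed

def fake_fetch(url):
    """Stand-in for urllib2.urlopen(url).read(); id 2 simulates a parse error."""
    if url.endswith('id=2'):
        return PARSE_ERROR
    return 'page for ' + url

pages = print_pages(fake_fetch, 1, 3)
```

In real use under Python 2, fetch could be something like `lambda url: urllib2.urlopen(url).read()`.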
