python urllib2 question
I am trying to print some info from a URL, but I want to skip the print if a certain text is found. I have:
import urllib2
url_number = 1
url_number_str = url_number
a = 1
while a != 10:
    f = urllib2.urlopen('http://example.com/?=' + str(url_number_str))
    f_contents = f.read()
    if f_contents != '{"Response":"Parse Error"}':
        print f_contents
    a += 1
    url_number_str += 1
So {"Response":"Parse Error"} is the text that I want to find, to avoid printing f.read() and load the NEXT URL (number 2).
Although your question is a bit unclear, try this:
f = urllib2.urlopen('http://example.com/?id=1000')
for line in f.readlines():
    if line != '{"Response":"Parse Error"}':
        print line
This loops over every line in the webpage and prints each one unless it is '{"Response":"Parse Error"}'.
Edit: Never mind, this is probably what you want:
f = urllib2.urlopen('http://example.com/?id=1000')
data = f.read()
if data != '{"Response":"Parse Error"}':
    print data
This will print the entire webpage, unless it is '{"Response":"Parse Error"}'.
read reads a block of data. The actual size of this block is most likely greater than the length of '{"Response":"Parse Error"}'.
So you should search for the string within the data that was read (see @harpyon's answer), using a regular expression or a strstr-like function.
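A minimal sketch of that substring search, assuming the response body is already available as a string. The helper names contains_parse_error and contains_parse_error_re are made up for illustration, and the regex variant assumes the server might put whitespace around the colon or braces, which the thread does not confirm:

```python
import re

# The marker text from the question.
PARSE_ERROR = '{"Response":"Parse Error"}'


def contains_parse_error(body):
    """Return True if the marker appears anywhere in the body.

    Unlike an exact == comparison, this still matches when the server
    adds a trailing newline or wraps the marker in other output.
    """
    return PARSE_ERROR in body


def contains_parse_error_re(body):
    """Same check with a regular expression (hypothetical variant).

    Tolerates optional whitespace inside the JSON-like marker.
    """
    pattern = r'\{\s*"Response"\s*:\s*"Parse Error"\s*\}'
    return re.search(pattern, body) is not None
```

The plain `in` test is usually enough; the regex version only helps if the exact spacing of the marker can vary.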
I think this is what you want:
a = 1
while a != 100:
    f = urllib2.urlopen('http://example.com/?id=1000')
    f_contents = f.read()
    if f_contents != '{"Response":"Parse Error"}':
        print f_contents
    a += 1
Although, if you don't want to fetch the same page 100 times, you might have forgotten to add a into the URL.
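Putting the counter into the URL could look roughly like the sketch below. It assumes the page number goes in an id query parameter (the question's own URL has no parameter name, so BASE_URL is a guess), and the names fetch and collect_pages are hypothetical. The import fallback is only there so the sketch also runs on Python 3, where urllib2 became urllib.request:

```python
try:
    from urllib2 import urlopen          # Python 2, as in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent

PARSE_ERROR = '{"Response":"Parse Error"}'
BASE_URL = 'http://example.com/?id='     # assumed parameter name


def fetch(url):
    """Download a URL and return its body as text."""
    response = urlopen(url)
    try:
        data = response.read()
    finally:
        response.close()
    # On Python 3, read() returns bytes; decode so string checks work.
    if not isinstance(data, str):
        data = data.decode('utf-8', 'replace')
    return data


def collect_pages(first, last, fetch=fetch):
    """Fetch pages first..last and return the bodies without the marker."""
    pages = []
    for number in range(first, last + 1):
        body = fetch(BASE_URL + str(number))
        if PARSE_ERROR in body:
            continue  # skip this page and move on to the next number
        pages.append(body)
    return pages
```

Calling collect_pages(1, 9) would cover the same nine requests as the loop in the question; printing each returned body then gives the desired output. The fetch argument is only there so the loop can be tried without network access.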