Write PDF file from URL using urllib2
I'm trying to save a dynamically generated PDF file from a web server using Python's urllib2 module. I use the following code to fetch the data from the server and write it to a file on the local disk:
import urllib2
import cookielib
theurl = 'https://myweb.com/?pdf&var1=1'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders.append(('Cookie', cookie))
request = urllib2.Request(theurl)
print("... Sending HTTP GET to %s" % theurl)
f = opener.open(request)
data = f.read()
f.close()
opener.close()
FILE = open('report.pdf', "w")
FILE.write(data)
FILE.close()
This code runs without errors, but Adobe Reader does not recognize the written PDF file. If I make the request manually using Firefox, I receive the file without problems and can view it. Comparing the received HTTP headers (Firefox vs. urllib2), the only difference is an HTTP header field, "Transfer-Encoding: chunked". Firefox receives this field, but it does not seem to be received when I make the urllib2 request. Any suggestions?
Try changing
FILE = open('report.pdf', "w")
to
FILE = open('report.pdf', "wb")
The extra 'b' opens the file in binary mode. Currently you are writing binary data in text mode, which on platforms such as Windows translates every newline byte (\n) into \r\n and so corrupts the PDF.
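As a minimal sketch of the binary-mode write suggested above, the following writes a byte string to disk and reads it back to confirm nothing was altered. The helper name save_binary and the sample bytes are hypothetical stand-ins for the data returned by f.read() in the question; no network request is made here.

```python
import os
import tempfile


def save_binary(data, path):
    """Write raw bytes to path without any newline translation."""
    # "wb" opens the file in binary mode, so the bytes are stored as received.
    with open(path, "wb") as out:
        out.write(data)


# Hypothetical stand-in for the bytes returned by f.read() in the question.
sample = b"%PDF-1.4\n1 0 obj\r\n\x00\xff\n%%EOF\n"
target = os.path.join(tempfile.mkdtemp(), "report.pdf")
save_binary(sample, target)

# Reading the file back in binary mode should give identical bytes.
with open(target, "rb") as check:
    round_trip = check.read()
```

Using a with block also closes the file even if the write raises, so the explicit FILE.close() call is no longer needed.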