How to Download Files using Python?
Hi, everyone. I am new to Python and am using Python 2.5 on CentOS.
I need to download files the way wget does.
I have done some searching and found several solutions. An obvious one is this:
import urllib2
mp3file = urllib2.urlopen("http://www.example.com/songs/mp3.mp3")
output = open('test.mp3','wb')
output.write(mp3file.read())
output.close()
This works fine, but I want to know whether this snippet still works if the mp3 file is very large, such as 1 GB, 2 GB or even bigger. Are there better ways to download large files in Python, maybe with a progress bar like the one wget shows?
Thanks a lot!
There's an easier way:
import urllib
urllib.urlretrieve("http://www.example.com/songs/mp3.mp3", "/home/download/mp3.mp3")
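On the progress bar part of the question: urlretrieve accepts an optional third argument, a hook that it calls with the number of blocks transferred so far, the block size and the total size. Below is a minimal sketch rather than a finished tool. The names format_progress, show_progress and download_with_progress are made up for illustration, and the try/except import is only there so the same code also runs on Python 3, where the function lives in urllib.request.

```python
import sys

try:
    from urllib import urlretrieve          # Python 2, as used in this thread
except ImportError:
    from urllib.request import urlretrieve  # Python 3 location of the same function


def format_progress(block_count, block_size, total_size):
    """Build a one-line status string from the values urlretrieve reports."""
    done = block_count * block_size
    if total_size > 0:
        # The last block is usually partial, so cap the figure at 100.
        percent = min(100, done * 100 // total_size)
        return "%3d%% of %d bytes" % (percent, total_size)
    # No size reported by the server: show only the running count.
    return "%d bytes so far" % done


def show_progress(block_count, block_size, total_size):
    """Report hook: rewrite the same terminal line on every call."""
    sys.stdout.write("\r" + format_progress(block_count, block_size, total_size))
    sys.stdout.flush()


def download_with_progress(url, filename):
    """Hypothetical wrapper: fetch url into filename, printing progress."""
    urlretrieve(url, filename, show_progress)
    sys.stdout.write("\n")
```

Calling download_with_progress("http://www.example.com/songs/mp3.mp3", "/home/download/mp3.mp3") should then print a percentage that updates in place. When the server does not report a size, the hook receives a total that is not positive, and the sketch falls back to a plain byte count.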
For really big files, your code would use a lot of memory, since it loads the whole file into memory at once. It is better to read and write the data in chunks:
from __future__ import with_statement
import urllib2
mp3file = urllib2.urlopen("http://www.example.com/songs/mp3.mp3")
with open('test.mp3','wb') as output:
    while True:
        buf = mp3file.read(65536)
        if not buf:
            break
        output.write(buf)
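For the progress bar, the same kind of loop can count the bytes it has written. This is a rough sketch, not a drop-in replacement: copy_with_progress is a hypothetical helper, and total_size is assumed to come from the Content-Length header, which not every server sends.

```python
import sys


def copy_with_progress(source, dest, total_size=None, chunk_size=65536):
    """Copy source to dest chunk by chunk, printing progress.

    source and dest are file-like objects; returns the number of bytes copied.
    """
    copied = 0
    while True:
        buf = source.read(chunk_size)
        if not buf:
            break
        dest.write(buf)
        copied += len(buf)
        if total_size:
            # Known size: show a percentage.
            sys.stdout.write("\r%d%%" % (copied * 100 // total_size))
        else:
            # Unknown size: show the running byte count instead.
            sys.stdout.write("\r%d bytes" % copied)
        sys.stdout.flush()
    sys.stdout.write("\n")
    return copied
```

With the names from the chunked loop, this would be called as copy_with_progress(mp3file, output, total_size) inside the with block. On urllib2 the size should be available as mp3file.info().get("Content-Length"), which is a string (so convert it with int) or None when the header is absent.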
Why not just call wget, then?
import os
os.system("wget http://www.example.com/songs/mp3.mp3")
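If you go this route and the URL comes from a variable or from user input, passing the command as a list to subprocess.call keeps the shell from interpreting characters such as & or ; in the URL. A small sketch, assuming wget is installed and on the PATH; the helper names build_wget_command and download_with_wget are hypothetical.

```python
import subprocess


def build_wget_command(url, output_path):
    """Return the wget argument list; -O names the file to write."""
    return ["wget", "-O", output_path, url]


def download_with_wget(url, output_path):
    """Run wget without a shell and return its exit status (0 means success)."""
    return subprocess.call(build_wget_command(url, output_path))
```

wget prints its own progress bar to the terminal, so nothing extra is needed for that part.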
Your current code will read the entire stream into memory before writing it to disk, so when the file is larger than your available memory, you will run into problems.
To resolve this, you can read one chunk at a time and write it to the file.
(Copied from "Stream large binary files with urllib2 to file".)
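from __future__ import with_statement  # needed for "with" on Python 2.5
import urllib2

url = "http://www.example.com/songs/mp3.mp3"  # URL from the question
filename = "test.mp3"  # renamed from "file" to avoid shadowing the built-in
req = urllib2.urlopen(url)
CHUNK = 16 * 1024  # read 16 KiB at a time
with open(filename, 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk:
            break
        fp.write(chunk)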
"Experiment a bit with various CHUNK sizes to find the 'sweet spot' for your requirements."