Download and write .tar.gz files without corruption
How do you download files, specifically .zip and .tar.gz, with Ruby and write them to the disk?
Note: this question was originally specific to a bug in MacRuby, but the answers are relevant to the general question above.
Using MacRuby, I've found that the downloaded file appears to be the same size as the reference, but the archives refuse to extract. What I'm attempting now is at: https://gist.github.com/arbales/8203385 Thanks!
I've successfully downloaded and extracted GZip files with this code:
require 'open-uri'
require 'zlib'
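# Open the local file in binary mode ('wb') so the bytes are written unmodified.
open('tarball.tar', 'wb') do |local_file|
  # On Ruby 3.0+, Kernel#open no longer opens URLs; call URI.open here instead.
  open('http://github.com/jashkenas/coffee-script/tarball/master/tarball.tar.gz') do |remote_file|
    # Decompress the gzip stream, so what lands on disk is the plain .tar.
    local_file.write(Zlib::GzipReader.new(remote_file).read)
  end
end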
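To check that a downloaded archive is actually intact, one option is to list its entries without extracting anything. The following is only a sketch: tarball_entries is a hypothetical helper name, and it relies on the tar reader that ships with RubyGems rather than on anything from the original answer.

```ruby
require 'rubygems/package'
require 'zlib'

# Hypothetical helper: returns the entry names stored in a gzipped tarball.
# Raises Zlib::GzipFile::Error if the file on disk is not valid gzip data.
def tarball_entries(path)
  names = []
  Zlib::GzipReader.open(path) do |gz|
    Gem::Package::TarReader.new(gz).each do |entry|
      names << entry.full_name
    end
  end
  names
end
```

If this raises on a file that has the right size, the bytes were probably altered on the way to disk (for example by text-mode writes) or the file is not gzip-compressed at all.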
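I'd recommend using open-uri in Ruby's stdlib.
require 'open-uri'
# 'wb' opens the output in binary mode so archive bytes are written unmodified.
open(out_file, 'wb') do |out|
  # On Ruby 3.0+, Kernel#open no longer opens URLs; call URI.open(url) instead.
  out.write(open(url).read)
end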
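For large archives, calling read pulls the whole body into a Ruby string before writing it. A possible alternative, sketched here with a hypothetical helper named save_stream, is to let IO.copy_stream move the bytes from the remote handle to the file in chunks.

```ruby
# Hypothetical helper: copies everything from an IO-like source to a file
# on disk in binary mode, without building one big string in memory.
# Returns the number of bytes copied.
def save_stream(source, path)
  File.open(path, 'wb') do |out|
    IO.copy_stream(source, out)
  end
end
```

With open-uri loaded, this could be used as URI.open(url) { |remote| save_stream(remote, out_file) }, where url and out_file are the same placeholders as in the answer above.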
http://ruby-doc.org/stdlib/libdoc/open-uri/rdoc/classes/OpenURI/OpenRead.html#M000832
Make sure you look at the :progress_proc option to open as it looks like you want a progress hook.
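As a rough illustration of that hook: open-uri accepts :content_length_proc and :progress_proc in its options hash. The helper below, progress_options, is a hypothetical name, and the percentage formatting is just one way to present the numbers.

```ruby
# Hypothetical helper: builds an open-uri options hash that prints progress.
# :content_length_proc receives the Content-Length (or nil if unknown);
# :progress_proc receives the number of bytes received so far.
def progress_options(out = $stderr)
  total = nil
  {
    content_length_proc: ->(length) { total = length },
    progress_proc: lambda do |received|
      if total && total > 0
        out.print format("\r%d%%", received * 100 / total)
      else
        out.print "\r#{received} bytes"
      end
    end
  }
end
```

The hash would then be passed as the second argument, for example URI.open(url, progress_options), with url standing in for the real address.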
The last time I got corrupted files with Ruby was when I forgot to call file.binmode right after File.open. It took me hours to find out what was wrong. Does it help with your issue?
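A minimal sketch of what that looks like in practice, using a hypothetical helper named write_binary: the file is switched to binary mode before anything is written, which has the same effect as opening it with the 'wb' mode string.

```ruby
# Hypothetical helper: writes raw bytes to disk with the file in binary mode,
# so no newline translation or encoding conversion is applied to the data.
# Returns the number of bytes written.
def write_binary(path, bytes)
  File.open(path, 'w') do |file|
    file.binmode
    file.write(bytes)
  end
end
```

Text-mode newline translation mainly bites on Windows, but setting binary mode everywhere keeps the behavior consistent across platforms.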
When downloading a .tar.gz with open-uri via a simple open() call, I was also getting errors uncompressing the file on disk. I eventually noticed that the file size was much larger than expected. Inspecting the file download.tar.gz on disk, what it actually contained was download.tar uncompressed, and that could be untarred. This seems to be due to an implicit Accept-encoding: gzip header on the open() call, which makes sense for web content but is not what I wanted when retrieving a gzipped tarball. I was able to work around it and defeat that behavior by sending a blank Accept-encoding header in the optional hash argument to the remote open():
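open('/local/path/to/download.tar.gz', 'wb') do |file|
  # Send a blank Accept-encoding header so the response is not transparently decompressed.
  # On Ruby 3.0+, Kernel#open no longer opens URLs; call URI.open here instead.
  file.write open('https://example.com/remote.tar.gz', {'Accept-encoding'=>''}).read
end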
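A quick way to tell whether a file on disk is still gzip-compressed, or was silently decompressed as described above, is to look at its first two bytes. Gzip data starts with 0x1f 0x8b. The helper name gzip_file? is made up for this sketch.

```ruby
# Hypothetical helper: true if the file starts with the gzip magic bytes
# 0x1f 0x8b, false otherwise (including for an empty file).
def gzip_file?(path)
  File.binread(path, 2) == "\x1f\x8b".b
end
```

If gzip_file?('/local/path/to/download.tar.gz') returns false while tar can still read the file, the download was most likely stored as a plain .tar.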