开发者

Continue getting a partially-downloaded file

Is there a Perl-tool, which behaves like wget --continue and is able to continue getting a partially-downl开发者_StackOverflowoaded file?


AnyEvent::HTTP's documentation contains code that uses HTTP 1.1's ability to resume a download. I never used it though, so I can't comment on its suitability.

Apparently the example expects you to already know how to use AnyEvent, which, of course, I did not know. You need to modify the code to have the event loop it expected to already be in place:

#!/usr/bin/perl

use strict;
use warnings;

use AnyEvent::HTTP;

my $url  = "http://localhost/foo.txt";
my $file = "foo.txt";

sub download {
    my ($url, $file, $cb) = @_;

    open my $fh, "+>>:raw", $file
        or die "could not open $file: $!";

    my %hdr;
    my $ofs = 0;

    if (stat $fh and $ofs = -s _) {
        $hdr{"if-unmodified-since"} = AnyEvent::HTTP::format_date((stat _)[9]);
        $hdr{"range"} = "bytes=$ofs-";
    }

    http_get $url, (
        headers   => \%hdr,
        on_header => sub {
            my ($hdr) = @_;

            if ($hdr->{Status} == 200 && $ofs) {
                # resume failed
                truncate $fh, $ofs = 0;
            }

            sysseek $fh, $ofs, 0;

            return 1;
        },
        on_body   => sub {
            my ($data, $hdr) = @_;

            if ($hdr->{Status} =~ /^2/) {
                length $data == syswrite $fh, $data
                    or return; # abort on write errors
            }

            return 1;
        },
        sub {
            my (undef, $hdr) = @_;

            my $status = $hdr->{Status};

            if (my $time = AnyEvent::HTTP::parse_date $hdr->{"last-modified"}) {
                utime $fh, $time, $time;
            }

            if ($status == 200 || $status == 206 || $status == 416) {
                # download ok || resume ok || file already fully downloaded
                $cb->(1, $hdr);

            } elsif ($status == 412) {
                # file has changed while resuming, delete and retry
                unlink $file;
                $cb->(0, $hdr);

            } elsif ($status == 500 or $status == 503 or $status =~ /^59/) {
                # retry later
                $cb->(0, $hdr);

            } else {
                $cb->(undef, $hdr);
            }
        }
    );
}

my $quit = AnyEvent->condvar; #create a handle to exit the loop

download $url, $file, sub {
    if ($_[0]) {
        print "OK!\n";
    } elsif (defined $_[0]) {
        print "please retry later\n";
    } else {
        print "ERROR\n";
    }
    $quit->send; #quit the loop
};

$quit->recv; #start the loop

The key to making it work is the $quit conditional variable:

my $quit = AnyEvent->condvar; #handle to exit the loop
.
.
.
$quit->recv;

This sets up an event loop. Without the event loop the program exits before the call to http_get gets a chance to do anything but create the file. To exit the event loop we call $quit->send in the callback to the download function.


Tried it (2 times) and it worked.

#!/usr/local/bin/perl
use warnings;
use 5.014;
use utf8;
use LWP::UserAgent;
use File::Basename;

my $url = 'http://server/somelargefile';
my $file = basename $url;
my $ua = LWP::UserAgent->new( show_progress => 1 );

open my $fh, '>>:raw', $file or die $!; 
my $bytes = -s $file;
my $res;       
if ( $bytes ) { 
    say "resume download: $file ($bytes)";
    $res = $ua->get( 
        $url,
        'Range' => "bytes=$bytes-",
        ':content_cb' => sub { my ( $chunk ) = @_; print $fh $chunk; }
    );
} else {
    say "start download";
    $res = $ua->get( 
        $url,
        ':content_cb' => sub { my ( $chunk ) = @_; print $fh $chunk; } 
    );
}
close $fh;

my $status = $res->status_line;
if ( $status =~ /^(200|206|416)/ ) {
    say "OK" if $status =~ /^20[06]/;
    say "$file already downloaded" if $status =~ /^416/;
} else {
    say $status;
}


I found this with some help from Google (needed specific search terms, not saying GIYF). Link to Google Translate.

The example code they have there is

use strict;
use LWP::UserAgent;

my $u = "http://www.mangafox.com/media/manga.banner.png";
my $f = "tmp.jpg"; 

my $ua = LWP::UserAgent->new();

## Uncomment for test
# unlink $f;
# system("cp tmp.jpg tmp0.jpg");
# system( "head -c 10000 tmp0.jpg > tmp.jpg" );

download($u, $f);

sub download  {
    my ($url, $file) = @_;

    my ($tries, @parameters, $FD);

    @parameters = ( $url,                  
                    ":content_cb" => sub { my ($chunk) = @_;
                                           print $FD $chunk;
                                         } 
                  ); 
    $tries = 4;

    while ( $tries ) {

        open($FD, ">>$file") || die "ERROR: $!";      

        my $bytes = -s $file;        
        if ( $bytes > 0 ) { push(@parameters, "Range" => "bytes=$bytes-" ) }

        my $res =$ua->get( @parameters );
        print $res->status_line . "\n";
        close $FD;

        # 416 Requested Range Not Satisfiable
        # (file already fully downloaded)

        if ( $res->is_success || $res->code == 416 ) { return }        
        $tries --;
    }

    die "ERROR: download $url";   
}

# Test with ImageMagick
system("identify tmp.jpg");

I am not saying this works or is right for you. Just something I found. Use at your own risk.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜