
Perl Mechanize timeout not working with https

I've been using Perl's WWW::Mechanize library, but for some reason the timeout parameter doesn't work with https (I'm using Crypt::SSLeay for SSL).

my $browser = WWW::Mechanize->new(autocheck=>0, timeout=>3);

Has anyone encountered this before and knows how to fix it? Thanks!


For HTTPS/SSL you have to use a workaround:

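# Fetch the page with wget, which enforces its own timeout (-T) and makes a single attempt (-t 1)
my $html = `wget -q -t 1 -T $timeout -O - $url`;
$mech->get(0);
# Hand the fetched HTML to Mechanize so its parsing methods work on it
$mech->update_html($html);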
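One caveat with the backtick form is that $url is interpolated into a shell command line, so a URL containing shell metacharacters can break the command or run something unintended. A minimal sketch of a shell-free variant follows; `capture_cmd` is a hypothetical helper name, not part of WWW::Mechanize, and it assumes a Unix-like system where the list form of a piped open is available.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without going through a shell
# and return everything it wrote to stdout.
sub capture_cmd {
    my @cmd = @_;

    # List-form piped open: each argument is passed to the program as-is,
    # so metacharacters in a URL are never interpreted by a shell.
    open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!";

    local $/;               # slurp mode: read all output at once
    my $out = <$fh>;

    # close() returns false if the command exited with a non-zero status.
    close($fh) or die "$cmd[0] exited with status " . ($? >> 8) . "\n";
    return $out;
}

# Possible usage with the same wget flags as the workaround above
# ($timeout, $url and $mech are assumed to be defined by the caller):
#
#   my $html = capture_cmd('wget', '-q', '-t', 1, '-T', $timeout, '-O', '-', $url);
#   $mech->update_html($html);
```

Because the helper dies on a non-zero exit status, a wget timeout surfaces as an exception rather than as an empty page.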


Testing it just now against https://www.sourceforge.net/, I get the impression that the timeout argument does work, but that it only takes effect after the HTTPS negotiation has finished. I set the timeout very low, to a fractional value, and it does report a timeout, but only after a delay much longer than my timeout value, at which point it immediately returns with a timeout error.

Example:

#!/usr/bin/perl

use strict;
use warnings;
$|=1;

# This "works", downloading the page within the timeout period
use WWW::Mechanize;
my $mech = WWW::Mechanize->new(
    timeout => 3,
);
$mech->get( 'https://www.sourceforge.net/' );
print "Successful get.\n";

# This throws a connect timeout, but after a delay much longer than 50ms
my $mech2 = WWW::Mechanize->new(
    timeout => 0.05,
);
$mech2->get( 'https://www.sourceforge.net/' );
print "Successful get 2.\n";

Output:

Successful get.
Error GETing http://sourceforge.net/: Can't connect to sourceforge.net:80
(connect: timeout) at ./throwaway22855.pl line 20

It appears the timeout is handled deep down below in IO::Socket, using select. On some systems, this may interfere with SIGALRM, so if you want to work around this and write your own timeout, make sure you read your platform's implementation docs. Also note (in perldoc perlipc) that Perl has used deferred signals since 5.8.x, so setting an alarm by hand may not work without using the sigprocmask workaround.
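If you do decide to write your own timeout, one commonly used pattern is to install the SIGALRM handler through POSIX::sigaction, since a handler installed that way is not deferred by default. The following is only a sketch under that assumption: `with_hard_timeout` is a hypothetical helper name, `alarm` accepts whole seconds only, and behaviour may differ on platforms without POSIX signals.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper: run $code and die with "timeout\n" if it takes
# longer than $seconds (whole seconds, because alarm() is integer-based).
sub with_hard_timeout {
    my ($seconds, $code) = @_;

    # A handler installed via POSIX::sigaction is delivered immediately
    # unless ->safe(1) is set, so it can interrupt a blocked system call.
    my $old = POSIX::SigAction->new;
    my $new = POSIX::SigAction->new(sub { die "timeout\n" });
    POSIX::sigaction(SIGALRM, $new, $old) or die "sigaction failed: $!";

    my @result;
    my $ok = eval {
        alarm $seconds;
        @result = $code->();
        alarm 0;
        1;
    };
    my $err = $@;

    alarm 0;                            # make sure no alarm is left pending
    POSIX::sigaction(SIGALRM, $old);    # restore the previous handler

    die $err unless $ok;
    return @result;
}

# Possible usage around a Mechanize request ($mech and $url assumed):
#
#   my $ok = eval { with_hard_timeout(3, sub { $mech->get($url) }); 1 };
#   warn "request failed or timed out: $@" unless $ok;
```

Dying from an immediate signal handler trades safety for responsiveness, which is the risk perlipc describes, so treat this as a last resort when the library's own timeout does not cover the slow phase.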

There is some more information here: SIGALRM Timeout -- How does it affect existing operations?
