
How can I get the contents of a followed link in WWW::Mechanize?

I hope this is my last question on this topic. I am using $mech->follow_link to try to download a file, but the file that gets saved is just the page I first loaded, not the target of the link I want to follow. Is this the correct way to download the file from the link? I do not want to use wget.

    #!/usr/bin/perl -w
    use strict;
    use LWP;
    use WWW::Mechanize;
    my $now_string = localtime;
    my $mech = WWW::Mechanize->new();
    my $filename = join(' ', split(/\W++/, $now_string, -1));
    $mech->credentials( '***********' , '************'); # if you need to supply a server and realm, use credentials as described in the LWP documentation
    $mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/') or die "Error: failed to load the web page";
    $mech->follow_link( url_regex => qr/MESH/i ) or die "Error: failed to download content";
    $mech->save_content("$filename.kmz");


Steps to try

  1. First print the content returned by your get, to make sure you're reaching a valid HTML page
  2. Make sure the link you're following really is the third link matching "MESH" (is the match case-sensitive?)
  3. Print the content returned by your second request (the followed link)
  4. Print the filename to make sure it's well-formed
  5. Check that the file was created successfully
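
The steps above can be sketched as one small script. This is only a rough illustration, not the asker's actual code: the helper names `timestamp_filename` and `debug_download`, the `strftime` pattern, and the command-line arguments are all assumptions made for the example. It prints which link the regex really selects before fetching it, which is usually the quickest way to see why the saved file is the wrong page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Step 4: build a timestamped file name without spaces or punctuation.
# (Hypothetical helper; the format string is just one reasonable choice.)
sub timestamp_filename {
    my ($epoch, $ext) = @_;
    return strftime('%Y%m%d_%H%M%S', localtime $epoch) . ".$ext";
}

# Hypothetical helper that walks through steps 1 to 5 and reports each one.
sub debug_download {
    my ($url, $link_re) = @_;

    require WWW::Mechanize;    # CPAN module, loaded only when a download is requested
    my $mech = WWW::Mechanize->new(autocheck => 1);    # die on HTTP errors

    # Step 1: confirm the first page is the HTML index you expect.
    $mech->get($url);
    printf "page: %s (%s, html=%s)\n",
        $mech->uri, $mech->ct, $mech->is_html ? 'yes' : 'no';

    # Step 2: see which link the regex selects before following it.
    my $link = $mech->find_link(url_regex => $link_re)
        or die "no link matches $link_re\n";
    print "matched link: ", $link->url_abs, "\n";

    # Step 3: fetch that link and inspect what came back.
    $mech->get($link->url_abs);
    printf "fetched: %s (%s, %d bytes)\n",
        $mech->uri, $mech->ct, length $mech->content;

    # Steps 4 and 5: print the file name, save, and verify the file is non-empty.
    my $file = timestamp_filename(time, 'kmz');
    print "saving to: $file\n";
    $mech->save_content($file);
    die "file missing or empty: $file\n" unless -s $file;
    return $file;
}

# Usage: perl debug_download.pl URL REGEX
if (@ARGV == 2) {
    my ($url, $pattern) = @ARGV;
    debug_download($url, qr/$pattern/i);
}
else {
    print "usage: $0 URL REGEX\n";
}
```

If the "matched link" line shows the index page itself (or a sort or parent-directory link) rather than a .kmz file, the regex is too loose and needs to be narrowed.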

Additional

  • You don't need the unless in either case: with autocheck enabled, the call is either going to work or it's going to die

Example

    #!/usr/bin/perl -w

    use strict;
    use WWW::Mechanize;

    sub main {

        my $url  = qq(http://www.kmzlinks.com);
        my $dest = qq($ENV{HOME}/Desktop/destfile.kmz);

        my $mech = WWW::Mechanize->new(autocheck => 1);

        # if needed, pass your credentials before this call
        $mech->get($url);
        die "Couldn't fetch page" unless $mech->success;

        # find all the links that have urls to kmz files
        my @links = $mech->find_all_links( url_regex => qr/(?:\.|%2E)kmz$/i );

        foreach my $link (@links) {               # (loop example)

            # use absolute URL path of the link to download file to destination
            $mech->get($link->url_abs, ':content_file' => $dest);

            last;                                 # only need one (for testing)
        }
    }

    main();


Are you sure you want the 3rd link called 'MESH'?


Change if to unless.
