Looking for a way to scrape URLs from a page and output them to a text file
I am looking for a way to scrape URLs from a web page and write them to a text file.
For example, if a page contains multiple links such as http://example.com/article, I want to grab all of those URLs and write them to a text file.
Have a look at WWW::Mechanize.
Example code:
use strict;
use warnings;
use 5.010;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
$mech->get('http://example.com/example');

# Print the absolute URL of every link on the page
foreach my $link ($mech->find_all_links()) {
    say $link->url_abs();
}
Use HTML::SimpleLinkExtor:
use strict;
use warnings;
use 5.010;
use HTML::SimpleLinkExtor;

my $extor = HTML::SimpleLinkExtor->new();
$extor->parse_url('http://example.com/article');

# Print each extracted absolute URL
my @links = $extor->absolute_links();
say for @links;
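Both snippets print links to STDOUT, so the simplest way to get a text file is shell redirection (`perl scrape.pl > links.txt`). If you want the script itself to write the file, here is a minimal sketch using the WWW::Mechanize approach; the output filename `links.txt` is just an example:

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get('http://example.com/example');

# Open the output file for writing; 'links.txt' is an example name
open my $fh, '>', 'links.txt' or die "Cannot open links.txt: $!";

# Write one absolute URL per line
print {$fh} $_->url_abs(), "\n" for $mech->find_all_links();

close $fh;
```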