How do I fetch and parse HTML with Perl?

How do I do the following in Perl, in order: a) curl a page and save it to a variable, b) parse the value of the variable (which is HTML content) for values I want (ex: the info is kept between tags like ... )


My Perl kung-fu is rusty, but I believe it's something along the following lines.

To fetch something using curl and then extract, for example, the contents of some HTML element:

use strict;
use warnings;
use WWW::Curl::Easy;

my $curl = WWW::Curl::Easy->new;
$curl->setopt(CURLOPT_URL, 'http://www.example.com/some-url.html');

# Collect the response body into a scalar via an in-memory filehandle.
my $response_body;
open(my $fileb, ">", \$response_body) or die "Cannot open in-memory file: $!";
$curl->setopt(CURLOPT_WRITEDATA, $fileb);
$curl->perform;
my $info = $curl->getinfo(CURLINFO_HTTP_CODE);

$response_body =~ m|<a[^>]+>(.+?)</a>|;

Now, $1 should contain the contents of the A element; if the match failed, $1 will be undefined (or Perl will complain about it being so). You should, of course, first check in $info that the status code is what you expect. This being Perl, it's ugly this way, but it works. However, I recommend not doing this often (and especially not in bigger scripts), as it's certainly the fastest road to shooting yourself in the foot with Perl:
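
For completeness, a minimal sketch of that status check (reusing $info and $response_body from the snippet above) could look like this:

# Only trust the body if the request actually returned 200.
if ($info == 200) {
    if ($response_body =~ m|<a[^>]+>(.+?)</a>|) {
        print "Link text: $1\n";
    } else {
        warn "No <a> element matched\n";
    }
} else {
    warn "Unexpected HTTP status: $info\n";
}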

You shoot yourself in the foot, but nobody can understand how you did it. Six months later, neither can you.

I hope it helps.

P.S. I am sure there is some easier way to do this, with much less code, but I can't remember how it goes...
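
For the record, one lighter-weight combination that is commonly used is LWP::UserAgent for the fetch and HTML::TreeBuilder for the parsing. A minimal sketch, assuming those modules are installed and using a placeholder URL (not necessarily the approach hinted at above):

use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;

my $ua  = LWP::UserAgent->new;
my $res = $ua->get('http://www.example.com/some-url.html');
die "Fetch failed: ", $res->status_line unless $res->is_success;

# Build a DOM-like tree and print the text of every <a> element.
my $tree = HTML::TreeBuilder->new_from_content($res->decoded_content);
for my $a ($tree->look_down(_tag => 'a')) {
    print $a->as_text, "\n";
}
$tree->delete;   # free the parse tree when done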
