What module can I use to parse RSS feeds in a Perl CGI script?
I am trying to find an RSS parser that can be used with a Perl CGI script. I found SimplePie, which is a really easy parser to use in PHP. Unfortunately, it doesn't work with a Perl CGI script. Please let me know if there is anything as easy to use as SimplePie.
I came across RssDisplay, but I am not sure how to use it or how good it is.
From CPAN: XML::RSS::Parser.
XML::RSS::Parser is a lightweight liberal parser of RSS feeds. This parser is "liberal" in that it does not demand compliance with a specific RSS version and will attempt to gracefully handle tags it does not expect or understand. The parser's only requirement is that the file is well-formed XML and remotely resembles RSS.
#!/usr/bin/perl
use strict; use warnings;
use XML::RSS::Parser;
use FileHandle;
my $parser = XML::RSS::Parser->new;
unless ( -e 'uploads.rdf' ) {
    require LWP::Simple;
    LWP::Simple::getstore(
        'http://search.cpan.org/uploads.rdf',
        'uploads.rdf',
    );
}
my $fh = FileHandle->new('uploads.rdf');
my $feed = $parser->parse_file($fh);
print $feed->query('/channel/title')->text_content, "\n";
my $count = $feed->item_count;
print "# of Items: $count\n";
foreach my $i ( $feed->query('//item') ) {
    print $i->query('title')->text_content, "\n";
}
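The script above prints plain text to a terminal. Since the question is about CGI, here is a minimal sketch of how the same calls might be wrapped to produce an HTML response. It assumes `uploads.rdf` has already been downloaded next to the script; the helper names `escape_html`, `render_feed_html`, and `main` are made up for this illustration, and the escaping is done by hand only to avoid pulling in another module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML so that text taken
# from the feed cannot inject markup into the page.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build an HTML fragment from a feed title and a list of item titles.
sub render_feed_html {
    my ( $feed_title, @item_titles ) = @_;
    my $html = '<h1>' . escape_html($feed_title) . "</h1>\n<ul>\n";
    $html .= '<li>' . escape_html($_) . "</li>\n" for @item_titles;
    return $html . "</ul>\n";
}

# Hypothetical entry point: emit the CGI header, then either the feed
# contents or a short fallback message if anything goes wrong.
sub main {
    my $file = 'uploads.rdf';    # assumption: fetched earlier, as in the script above

    binmode STDOUT, ':encoding(UTF-8)';
    print "Content-Type: text/html; charset=utf-8\r\n\r\n";

    my $html = eval {
        require XML::RSS::Parser;
        require FileHandle;
        my $fh   = FileHandle->new($file) or die "cannot open $file: $!\n";
        my $feed = XML::RSS::Parser->new->parse_file($fh);
        my $title = $feed->query('/channel/title')->text_content;
        my @items = map { $_->query('title')->text_content } $feed->query('//item');
        render_feed_html( $title, @items );
    };

    if ( defined $html ) {
        print $html;
    }
    else {
        print "<p>Feed unavailable.</p>\n";
    }
}

main();
```

The header is printed before any parsing so that a missing file or malformed feed yields a readable page rather than a server error. In a real deployment you would probably refresh the local copy of the feed on a schedule instead of fetching it on every request.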
Available Perl Modules
XML::RSS::Tools
XML::RSS::Parser:
#!/usr/bin/perl -w
use strict;
use XML::RSS::Parser;
use FileHandle;
my $p = XML::RSS::Parser->new;
my $fh = FileHandle->new('/path/to/some/rss/file');
my $feed = $p->parse_file($fh);
# output some values
my $feed_title = $feed->query('/channel/title');
print $feed_title->text_content;
my $count = $feed->item_count;
print " ($count)\n";
foreach my $i ( $feed->query('//item') ) {
    my $node = $i->query('title');
    print ' '.$node->text_content;
    print "\n";
}
XML::RSS::Parser::Lite (Pure Perl):
use XML::RSS::Parser::Lite;
use LWP::Simple;
my $xml = get("http://url.to.rss");
my $rp = XML::RSS::Parser::Lite->new;
$rp->parse($xml);
print join(' ', $rp->get('title'), $rp->get('url'), $rp->get('description')), "\n";
for (my $i = 0; $i < $rp->count(); $i++) {
    my $it = $rp->get($i);
    print join(' ', $it->get('title'), $it->get('url'), $it->get('description')), "\n";
}
dirtyRSS:
use dirtyRSS;
$tree = parse($in);
die("$tree\n") unless (ref $tree);
disptree($tree, 0);