What's a simple Perl script to parse an HTML document with custom tags (Perl interpreter)?
OK, this is what I'm doing: I'm making a Perl interpreter for documents that end in my custom extension (.cpm). I have looked around and found:
- http://perlmeme.org/tutorials/html_parser.html
- https://metacpan.org/pod/HTML::TokeParser::Simple
- http://www.justskins.com/forums/html-parser-8489.html
It seems that HTML::Parser is the way to go. What I am asking for is a simple tutorial on parsing a document with special tags. For example, I would like something that shows me how to parse an HTML document so that whenever <putinbold> is encountered, it is replaced with <b>.
An example of what I want:
<html>
This is HTML talking
<liamslanguage>say "This is Liams language speaking"</liamslanguage>
</html>
The important part of parsing with HTML::Parser is to assign the right handlers with the right argspec. A sample program:
#!/usr/bin/env perl
use strict;
use warnings;
use HTML::Parser;
my $html;
sub replace_tagname {
    my ( $tagname, $event ) = @_;
    if ( $tagname eq 'liamslanguage' ) {
        $tagname = 'b';
    }
    if ( $event eq 'start' ) {
        $html .= "<$tagname>";
    }
    elsif ( $event eq 'end' ) {
        $html .= "</$tagname>";
    }
}
my $p = HTML::Parser->new(
    'api_version' => 3,
    'start_h'     => [ \&replace_tagname, 'tagname, event' ],
    'default_h'   => [ sub { $html .= shift }, 'text' ],
    'end_h'       => [ \&replace_tagname, 'tagname, event' ],
);
$p->parse( do { local $/; <DATA> } );
$p->eof();
print $html;
__DATA__
<html>
This is HTML talking
<liamslanguage>say "This is Liams language speaking"</liamslanguage>
</html>
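The same handler-plus-argspec idea can be extended to several custom tags at once, including the <putinbold> case from the question. The following is only a sketch under some assumptions: the %TAG_MAP hash and the rewrite_tags sub are hypothetical names introduced here for illustration, and the 'tagname, text' argspec is used so that tags which are not in the map pass through verbatim, attributes included.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical mapping from custom tag names to real HTML tag names.
# HTML::Parser lowercases tag names by default, so keys are lowercase.
my %TAG_MAP = (
    putinbold     => 'b',
    liamslanguage => 'b',
);

# Rewrite every mapped tag in $source and return the resulting markup.
# Note: attributes on a mapped custom tag are dropped in this sketch.
sub rewrite_tags {
    my ($source) = @_;
    my $out = '';

    # One handler body per event type; unmapped tags are copied as-is
    # by appending the original source text of the tag.
    my $start = sub {
        my ( $tagname, $text ) = @_;
        $out .= exists $TAG_MAP{$tagname} ? "<$TAG_MAP{$tagname}>" : $text;
    };
    my $end = sub {
        my ( $tagname, $text ) = @_;
        $out .= exists $TAG_MAP{$tagname} ? "</$TAG_MAP{$tagname}>" : $text;
    };

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $start, 'tagname, text' ],
        end_h       => [ $end,   'tagname, text' ],
        # Everything else (text, comments, declarations) is copied through.
        default_h   => [ sub { $out .= shift }, 'text' ],
    );
    $p->parse($source);
    $p->eof;
    return $out;
}

print rewrite_tags(
    qq{<p class="x">This is <putinbold>bold</putinbold> text</p>\n}
);
```

Keeping the mapping in a hash means a new custom tag only needs a new entry, not another branch in the handler. For a tag like <liamslanguage> whose contents should be interpreted rather than just re-tagged, the text between the start and end events would have to be collected and processed separately, which this sketch does not attempt.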