
What's a simple Perl script to parse an HTML document with custom tags (Perl interpreter)?

OK, this is what I'm doing: I'm making a Perl interpreter for documents that end in my custom extension (.cpm). I have looked around and found:

  • http://perlmeme.org/tutorials/html_parser.html
  • https://metacpan.org/pod/HTML::TokeParser::Simple
  • http://www.justskins.com/forums/html-parser-8489.html

It seems that HTML::Parser is the way to go. What I am asking for is a simple tutorial on parsing a document with special tags. For example, I would like something that shows me how to parse an HTML document so that whenever <putinbold> is encountered, it is replaced with <b>.

An example of what I want:

<html>

This is HTML talking

<liamslanguage>say "This is Liams language speaking"</liamslanguage>

</html>


The important part of parsing with HTML::Parser is to assign the right handlers with the right argspec, the string that tells the parser which arguments to pass to each handler. A sample program:

#!/usr/bin/env perl

use strict;
use warnings;

use HTML::Parser;

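# Accumulates the rewritten document.
my $html = '';

# Called for every start and end tag. The tag is rebuilt from its name
# alone, so any attributes on the original tag are dropped.
sub replace_tagname {
    my ( $tagname, $event ) = @_;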

    if ( $tagname eq 'liamslanguage' ) {
        $tagname = 'b';
    }

    if ( $event eq 'start' ) {
        $html .= "<$tagname>";
    }
    elsif ( $event eq 'end' ) {
        $html .= "</$tagname>";
    }
}

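my $p = HTML::Parser->new(
    'api_version' => 3,
    # Tags go through replace_tagname; everything else (text, comments,
    # declarations) is copied to the output unchanged.
    'start_h'     => [ \&replace_tagname,      'tagname, event' ],
    'default_h'   => [ sub { $html .= shift }, 'text'           ],
    'end_h'       => [ \&replace_tagname,      'tagname, event' ],
);
# Slurp the whole __DATA__ section and parse it in one go.
$p->parse( do { local $/; <DATA> } );
$p->eof();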

print $html;

__DATA__
<html>
This is HTML talking
<liamslanguage>say "This is Liams language speaking"</liamslanguage>
</html>
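
The sample program rebuilds every tag from its name, so attributes on ordinary tags such as <p class="intro"> would be lost. One possible variation, offered only as a sketch: also request the 'text' argspec (the original source text of the tag) and copy unmapped tags through verbatim. The %TAG_MAP table and the convert_custom_tags helper below are hypothetical names made up for illustration; they are not part of HTML::Parser.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use HTML::Parser;

# Hypothetical mapping from custom tag names to standard HTML tag names.
# HTML::Parser lowercases tag names by default, so the keys are lowercase.
my %TAG_MAP = (
    putinbold     => 'b',
    liamslanguage => 'b',
);

# Illustrative helper (not part of HTML::Parser): returns $source with
# every mapped custom tag replaced and everything else left untouched.
sub convert_custom_tags {
    my ($source) = @_;
    my $out = '';

    my $handler = sub {
        my ( $event, $tagname, $text ) = @_;

        if ( exists $TAG_MAP{$tagname} ) {
            # Mapped tag: emit the replacement, dropping any attributes.
            my $new = $TAG_MAP{$tagname};
            $out .= $event eq 'start' ? "<$new>" : "</$new>";
        }
        else {
            # Unmapped tag: copy the original source text, attributes included.
            $out .= $text;
        }
    };

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [ $handler, 'event, tagname, text' ],
        end_h       => [ $handler, 'event, tagname, text' ],
        default_h   => [ sub { $out .= shift }, 'text' ],
    );
    $p->parse($source);
    $p->eof;

    return $out;
}

print convert_custom_tags(
    qq{<p class="intro">This is <putinbold>important</putinbold>.</p>\n}
);
```

With this input, the script should print <p class="intro">This is <b>important</b>.</p>, keeping the class attribute on the paragraph. Adding another custom tag then only means adding a key to the mapping table.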