
Converting a MySQL TEXT field with line breaks to XML with a Perl script produces malformed output

I have a MySQL table with one field defined as TEXT. The data is entered into the database through a textarea on a web form.

I'm using the following script to generate XML from the table's contents:

#!/usr/bin/perl

use strict;
use DBI;
use XML::Generator::DBI;
use XML::Handler::YAWriter;

my $dbh = DBI->connect ("DBI:access info goes here",
                           { RaiseError => 1, PrintError => 0});
my $out = XML::Handler::YAWriter->new (AsFile => "-", Encoding => "ISO-8859-1");
my $gen = XML::Generator::DBI->new (
    Handler => $out,
    dbh     => $dbh
);
$gen->execute ("SELECT text FROM table");
$dbh->disconnect ();

The problem is that when the entered text contains line breaks, the script generates malformed XML:

<text {http://axkit.org/NS/xml-generator-dbi}encoding="HASH(0x9c43ba0)">PHA+YWlqZHNvaWFqZG9pYXNqZG9pYXNqb2RpanNhaW9kanNhb2lkYXNvaWo8L3A+DQo8cD5zPC9w
Pg0KPHA+ZDwvcD4NCjxwPmFzPC9wPg0KPHA+ZHNhPC9wPg0KPHA+ZDwvcD4NCjxwPnNhZHNhZHNh
ZHM8L3A+DQo8cD4mbmJzcDs8L3A+DQo8cD5hc2Rhc2Rzc2FkZHNkc2FzZHNhPC9wPg0KPHA+Jm5i
c3A7PC9wPg0KPHA+YXNkZHNhZHNhYXNkc2Rhc2RhYXNkPC9wPg==
</text>

For example if the text entered is:

<p>One</p>
<p>Two</p>

It outputs the malformed XML, but when the text is:

<p>One</p> <p>Two</p>

It prints out the XML correctly.

Is there any way to strip the line breaks from the textarea input, or to ignore them when creating the XML?

Thanks.


Enforcing well-formedness might work:

$text =~ s|(?i)(<br)>|$1 />|gm;

This turns every bare <br> tag into a self-closing <br /> tag, which XML well-formedness requires.
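As a quick check of that substitution, here is a minimal, self-contained sketch. The sample string and the helper name close_br_tags are made up for illustration; only the regex comes from the line above.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite bare <br> tags as self-closing <br /> tags.
sub close_br_tags {
    my ($text) = @_;
    # (?i) makes the match case-insensitive, so <BR> is handled too.
    $text =~ s|(?i)(<br)>|$1 />|g;
    return $text;
}

my $sample = "<p>One<br>Two<BR>Three</p>";
print close_br_tags($sample), "\n";
```

Note that the pattern only matches a tag written exactly as <br> (in any letter case). Tags that carry attributes, and tags that are already self-closing, are left untouched.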

From a cursory look at the classes you're using, it seems that if you can step into the handler chain and handle, say, the characters event, you might be able to do something like this before the call to XML::Generator::DBI->execute:

$gen->set_content_handler(
    SAXHandlerWrapper->new(
        characters => sub { 
            s|(?i)(<br)>|$1 />|gm; 
            return $out->characters( $_ ) 
        }
    )    
);

where SAXHandlerWrapper is defined as follows:

package SAXHandlerWrapper;
use 5.010;
use strict;
use warnings;
use Carp         qw<croak>;
use Params::Util qw<_CODE _HASH _IDENTIFIER _INSTANCE>;
use Scalar::Util qw<blessed>;

sub _make_handler {
    my $name = shift || $_;
    return if __PACKAGE__->can( $name );
    no strict;
    *$name = sub {
        my $action = shift->{ $name };
        local $_ = $_[0];
        return &$action;
    }
}
sub new {
    my $self = bless {}, shift;
    my $current_name;
    @_ = %{ shift() } if &_HASH( $_[0] );
    while ( local $_ = shift @_ ) {
        given ( $_ ) {
            when ( !_IDENTIFIER( $_ )) {
                croak( "Invalid parameter name: $_!" );
            }
            when ( 'event' )   {
                croak( "Invalid event name: $_!" )
                    unless $current_name = _IDENTIFIER( shift )
                    ;
                _make_handler( $current_name );
            }
            when ( 'action' ) {
                croak( 'Action not code reference!' )
                    unless my $action = _CODE( shift )
                    ;
                croak( 'No active handler name!' ) unless $current_name;
                $self->{ $current_name } = $action;
            }
            default {
                croak( "Invalid event: $_!" )
                    unless $self->{ $_ } = _CODE( shift )
                    ;
                 _make_handler( $_ );
           }
        }
    }
    Carp::croak( 'Nothing handled!' ) unless %$self;
    foreach ( grep { !_CODE( $self->{$_} ) } keys %$self ) {
        Carp::croak( "Handler for $_ is not complete!" );
    }
    return $self;
}
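
Separately, the payload in the question's output is Base64, and decoding a piece of it shows that the stored text contains CRLF line endings from the textarea. That is consistent with the observation that only multi-line input fails. If the line breaks carry no meaning, one option is to collapse them before the text reaches the XML generator. A minimal sketch, assuming that is acceptable for your data; the helper name collapse_line_breaks is hypothetical, and MIME::Base64 is a core module:

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# A fragment copied from the Base64 payload in the question's output.
my $fragment = decode_base64('L3A+DQo8cD5z');
print "fragment contains CRLF\n" if $fragment =~ /\r\n/;

# Hypothetical helper: replace each run of CR/LF characters with one space.
sub collapse_line_breaks {
    my ($text) = @_;
    $text =~ s/[\r\n]+/ /g;
    return $text;
}

print collapse_line_breaks("<p>One</p>\r\n<p>Two</p>"), "\n";
```

This yields the single-line form that the question reports as working. The same clean-up could probably be done in the query itself with MySQL's REPLACE() function, or once at the point where the web form saves the text.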