Perl XML conversion solution

I'm a beginner with Perl and CPAN modules.

I want to convert an XML file that contains:

<Item><Link>http://example.com/</Link></Item>....

to

<Item><Link>http://mysite.com/</Link></Item>....

Is there a smart solution using a CPAN module?


  • see XML::Twig - A perl module for processing huge XML documents in tree mode.
  • or XML::Simple - Easy API to maintain XML (esp config files)

For example:

use strict;
use warnings; 
use XML::Simple;
use Data::Dumper;

my $xml = q~<?xml version='1.0'?>
<root>
  <Item>
  <Link>http://example.com/</Link>
  </Item>
  <Item>
   <Link>http://example1.com/</Link>
  </Item>
</root>~;

print $xml,$/;

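# ForceArray keeps Item an array reference even if there is only one <Item>
my $data = XMLin($xml, ForceArray => ['Item']);

print Dumper( $data );

# Rewrite the host name in every child value of every Item
foreach my $test (@{$data->{Item}}){
   foreach my $key (keys %{$test}){
       $test->{$key} =~ s/example/mysite/;
   }
}
print XMLout($data, RootName=>'root', NoAttr=>1,XMLDecl => 1);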

output:

<?xml version='1.0'?>
<root>
  <Item>
  <Link>http://example.com/</Link>
  </Item>
  <Item>
   <Link>http://example1.com/</Link>
  </Item>
</root>
$VAR1 = {
          'Item' => [
                    {
                      'Link' => 'http://example.com/'
                    },
                    {
                      'Link' => 'http://example1.com/'
                    }
                  ]
        };
<?xml version='1.0' standalone='yes'?>
<root>
  <Item>
    <Link>http://mysite.com/</Link>
  </Item>
  <Item>
    <Link>http://mysite1.com/</Link>
  </Item>
</root>


A simple solution using XML::Twig is below. Unlike the XML::Simple option, it works no matter where the Link elements are in the XML, and it preserves the original formatting of the file. It also works if the XML contains mixed content.

If you need to change the file in place, you can use parsefile_inplace instead of parsefile. The regular expression in subs_text may need to be improved for real-life data, but this code should be a good starting point.

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

XML::Twig->new( twig_roots => { Link => \&replace_link, }, # process Link
                twig_print_outside_roots => 1,             # output everything else
              )
          ->parsefile( 'my.xml');

sub replace_link
  { my( $t, $link)= @_;
    # replace the whole link text; keep the trailing slash in the new URL
    $link->subs_text( qr{^http://example\.com/$}, 'http://mysite.com/');
    $t->flush;               # or $link->print, outputs the modified (or not) link
  }
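
The in-place variant mentioned above could look roughly like the sketch below. It assumes XML::Twig is installed; the function name rewrite_links_inplace, the file name my.xml and the '.bak' backup suffix are illustrative choices, not part of the original answer.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

# Rewrite Link elements in $path, replacing the file itself.
# The second argument to parsefile_inplace asks for a backup copy
# of the original (here: <file>.bak); the suffix is an assumption.
sub rewrite_links_inplace {
    my ($path) = @_;

    XML::Twig->new(
        twig_roots => {
            Link => sub {
                my ($t, $link) = @_;
                $link->subs_text( qr{^http://example\.com/$}, 'http://mysite.com/' );
                $link->print;    # output the (possibly modified) Link element
            },
        },
        twig_print_outside_roots => 1,    # copy everything else unchanged
    )->parsefile_inplace( $path, '.bak' );

    return;
}

# Hypothetical input file; nothing happens if it does not exist.
my $file = 'my.xml';
rewrite_links_inplace($file) if -e $file;
```

Keeping a backup suffix is a cautious default while the pattern is still being tuned; dropping the second argument makes parsefile_inplace replace the file without a backup.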


If all you need is to change a specific value, you don't really need anything special; you can simply use a regular expression. From the command line:

perl -pi -e 's@http://example\.com/@http://mysite.com/@g' file.xml

Edit: adding a full-code version:

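use strict;
use warnings;

my $file = '/tmp/test.xml';

# Write the modified copy to a temporary file, then swap it into place
open my $in,  '<', $file       or die "can't open $file: $!";
open my $out, '>', "$file.tmp" or die "can't open $file.tmp: $!";
while (my $line = <$in>) {     # read line by line instead of slurping
    $line =~ s@http://example\.com/@http://mysite.com/@g;
    print {$out} $line;
}
close($in);
close($out) or die "can't close $file.tmp: $!";

rename("$file.tmp", $file) or die "can't rename $file.tmp to $file: $!";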
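
When the URL to replace comes from a variable rather than being typed into the pattern, its dots and other metacharacters need quoting so that they match literally. A minimal sketch using only core Perl; the helper name replace_url is hypothetical:

```perl
use strict;
use warnings;

# Replace every literal occurrence of $old in $text with $new.
# \Q...\E quotes regex metacharacters, so '.' only matches a real dot.
sub replace_url {
    my ($text, $old, $new) = @_;
    $text =~ s/\Q$old\E/$new/g;
    return $text;
}

my $line = '<Item><Link>http://example.com/</Link></Item>';
print replace_url($line, 'http://example.com/', 'http://mysite.com/'), "\n";
# prints: <Item><Link>http://mysite.com/</Link></Item>
```

Without the quoting, a pattern built from 'http://example.com/' would also match a string such as 'http://exampleXcom/'.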