How to change namespace
How can I change a namespace prefix and its URI to new ones with Perl? The files are big (20 MB), consist of a single line, and have complicated structures. Example:
<?xml version="1.0" encoding="utf-8"?>
<m:sr xmlns:m="http://www.example.com/mmm" xml:lang="et">
<m:A m:AS="EX" m:KF="sss1">
<m:m m:u="uus" m:O="ggg">ggg</m:m>
</m:A>
</m:sr>
To:
<?xml version="1.0" encoding="utf-8"?>
<a:sr xmlns:a="http://www.example.com/aaa" xml:lang="et">
<a:A a:AS="EX" a:KF="sss1">
<a:m a:u="uus" a:O="ggg">ggg</a:m>
</a:A>
</a:sr>
You can do this with XML::Twig.
The code below is quite clean in that it makes very few assumptions about the input, notably it relies on the URI of the namespace in the input, not on the prefix. The twig is flushed every time any element is done parsing, so it keeps very little in memory.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
# parameters, may be changed
my $SR = 'sr'; # local name of the root of the fragment
my $OUT = 'a'; # prefix in the output
my $IN_NS = 'http://www.example.com/mmm'; # namespace URIs
my $OUT_NS = 'http://www.example.com/aaa';
my $t= XML::Twig->new(
    map_xmlns => { $IN_NS => $OUT, $OUT_NS => $OUT, },
    start_tag_handlers => { "$OUT:$SR" => \&change_ns_decl, },
    twig_handlers => { _all_ => sub { $_->flush; }, },
    keep_spaces => 1,
  )
  ->parse( \*DATA); # replace by parsefile( "my.xml");
exit;
sub change_ns_decl
  { my( $t, $sr)= @_;
    $sr->set_att( "xmlns:$OUT" => $OUT_NS);
  }
__DATA__
<?xml version="1.0" encoding="utf-8"?>
<m:sr xml:lang="et" xmlns:m="http://www.example.com/mmm">
<m:A m:AS="EX" m:KF="sss1">
<m:m m:u="uus" m:O="ggg">ggg</m:m>
</m:A>
</m:sr>
You could run the XML through this XSLT to transform into the desired output:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:m="http://www.example.com/mmm"
    xmlns:a="http://www.example.com/aaa"
    exclude-result-prefixes="m">
  <xsl:output indent="yes"/>
  <!--identity template to copy content forward by default-->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>
  <!--Change any elements bound to the "m" namespace, to be in the "a" namespace-->
  <xsl:template match="m:*">
    <xsl:element name="a:{local-name()}" >
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
  <!--Change any attributes bound to the "m" namespace, to be in the "a" namespace-->
  <xsl:template match="@m:*">
    <xsl:attribute name="a:{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
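A global search-and-replace can also work, as long as the prefix never appears in text content preceded by whitespace. Assuming you've loaded your file into one long string in $xml, the following substitutions rename the prefix in start tags, end tags, prefixed attributes and the namespace declaration (the URI in that declaration has to be replaced in the same way):
my $current_namespace = "m";
my $new_namespace = "a";
# start tags, end tags and prefixed attributes
$xml =~ s/(<\/?|\s)\Q$current_namespace\E:/$1$new_namespace:/g;
# the prefix in the xmlns declaration itself
$xml =~ s/xmlns:\Q$current_namespace\E=/xmlns:$new_namespace=/g;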
Since the files are relatively large, you may prefer a streaming approach: read the file piece by piece, convert each piece using the substitution shown above, and write the result out to a temp file. When finished, rename the temp file over the original so that it replaces it. Note that because your files consist of a single line, reading line by line would still load the whole file at once, so the pieces need a delimiter other than the newline.
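The streaming idea can be sketched in Perl as follows. This is only a rough sketch under some assumptions: it sets the input record separator $/ to ">" so that every record ends with one tag (which works even when the file is a single line), and it assumes that no ">" or "<" appears unescaped inside attribute values, comments or CDATA sections. The helper names rename_ns and rename_file are made up for this example; only the core modules File::Temp and File::Basename are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: rewrite the prefix and URI inside one record.
# A record is "text up to a tag" followed by the tag itself; only the
# tag part is modified, so text content is left alone.
sub rename_ns {
    my ($record, $old, $new, $old_uri, $new_uri) = @_;
    my ($text, $tag) = $record =~ /\A([^<]*)(.*)\z/s;
    # namespace declaration: change prefix and URI together
    $tag =~ s{xmlns:\Q$old\E=(["'])\Q$old_uri\E\1}{xmlns:${new}=$1${new_uri}$1}g;
    # start tags, end tags and prefixed attributes
    $tag =~ s{(</?|\s)\Q$old\E:}{$1${new}:}g;
    return $text . $tag;
}

# Hypothetical helper: stream $path through rename_ns into a temp file
# in the same directory, then rename the temp file over the original.
sub rename_file {
    my ($path, $old, $new, $old_uri, $new_uri) = @_;
    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    my ($out, $tmp) = tempfile(DIR => dirname($path), UNLINK => 0);
    binmode $out;
    local $/ = '>';    # one record per tag instead of one per line
    while (my $record = <$in>) {
        print {$out} rename_ns($record, $old, $new, $old_uri, $new_uri);
    }
    close $in;
    close $out or die "Cannot write $tmp: $!";
    rename $tmp, $path or die "Cannot replace $path: $!";
    return;
}

# Usage: script.pl file.xml m a http://www.example.com/mmm http://www.example.com/aaa
rename_file(@ARGV) if @ARGV == 5;
```

Because each record holds at most one tag, memory use stays small regardless of the file size. Unlike the XML::Twig and XSLT answers, this works on the prefix as text rather than on the namespace URI, so it is only safe when the input always uses the same prefix for that namespace.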