Adding an incrementing attribute to every tag in XML using a script

I want to add an incrementing attribute to every tag in my XML, using awk, sed, perl, or plain shell commands.

For example:

<tag1 key="123">
  <tag2 abc="xf d"/>
  <tag3 def="d2 32">
   </tag3>
</tag1>

I am expecting the following output:

<tag1 key="123" order="1">
  <tag2 abc="xf d" order="2"/>
  <tag3 def="d2 32" order="3">
   </tag3>
</tag1>

If possible, I would like to avoid any dependencies (Twig, LibXML) and use pure string manipulation.


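I like Perl's XML::Twig for this sort of thing. You'll have to adjust it so that it visits all the elements you want to affect. To handle parents before children, a queue is probably what you want. Note that a queue numbers the elements breadth-first, which matches document order for this sample but not necessarily for more deeply nested input: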

use XML::Twig;

my $xml = <<'XML';
<tag1 key="123">
  <tag2 key="1234"/>
  <tag3 key="12345">
   </tag3>
</tag1>
XML

my $twig = XML::Twig->new(
    pretty_print => 'indented',
    );
$twig->parse( $xml );
my @queue = ( $twig->root );

my $n = 1;  
while( my $elem = shift @queue ) {
    next unless $elem->tag =~ /\Atag[123]\z/;
    $elem->set_att( order => $n++ );
    push @queue, $elem->children( qr/\Atag/ );
    }

$twig->print;

The output from this script is:

<tag1 key="123" order="1">
  <tag2 key="1234" order="2"/>
  <tag3 key="12345" order="3"></tag3>
</tag1>


It's pretty simple with XML::LibXML and a drop of XPath.

#!/usr/bin/perl

use strict;
use warnings;

use XML::LibXML;

my $counter = 1;

my $xp = XML::LibXML->new->parse_file('test.xml');

foreach($xp->findnodes('//*')) { # '//*' matches every element node, in document order
  $_->setAttribute('order', $counter++);
}

print $xp->toString;


Normally you should use a proper parser to process XML. But in awk:

awk 'match($0, /<[^\/>]+/) { \
     $0 = substr($0, 1, RSTART+RLENGTH-1) " order=\"" ++i "\"" \
          substr($0, RSTART+RLENGTH) \
     }; 1'

I look for an opening tag (without the > or /> part) on every line. If one is found, I put the string order="i" after it, incrementing i each time. The single 1 on the last line just always executes awk's default action: { print $0 }.
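
To see what match() reports, here is a small sketch that prints RSTART, RLENGTH and the matched text for one line of the sample input. The function name show_match is made up for illustration; the regular expression is the one from the command above.

```shell
#!/bin/sh
# show_match is a hypothetical helper name used only for this illustration.
# For each input line containing an opening tag, print where the match
# starts, how long it is, and the matched text itself.
show_match() {
  awk 'match($0, /<[^\/>]+/) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }'
}

printf '%s\n' '  <tag2 abc="xf d"/>' | show_match
# prints: 3 16 <tag2 abc="xf d"
```

The new attribute is inserted right after position RSTART+RLENGTH-1, that is, just before the closing > or />.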

I updated the regular expression to work on your revised input. It still fails as soon as a line contains more than one opening tag, or an attribute value contains a / or > character.
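
If several opening tags can share a line, one possible variation is to loop over each line and keep matching in the unprocessed remainder. This is only a sketch under a few assumptions: the helper name add_order is invented here, the regular expression is tightened so that a tag name must start with a letter or underscore (which skips <?xml ...?> and <!-- ... -->), and attribute values are assumed not to contain / or >. It is still string manipulation, not parsing, so tag-like text inside comments or CDATA would be counted too.

```shell
#!/bin/sh
# add_order is a hypothetical helper name; it reads XML-like text on stdin.
add_order() {
  awk '{
    out = ""      # part of the line already processed
    rest = $0     # part of the line still to be scanned
    # Repeatedly find the next opening tag in the remainder of the line.
    while (match(rest, /<[A-Za-z_][^\/>]*/)) {
      stop = RSTART + RLENGTH - 1
      # Copy everything up to the end of the match, then append the counter.
      out = out substr(rest, 1, stop) " order=\"" (++i) "\""
      rest = substr(rest, stop + 1)
    }
    print out rest
  }'
}

printf '%s\n' '<a x="1"><b/><c y="2">' '</c></a>' | add_order
# prints:
# <a x="1" order="1"><b order="2"/><c y="2" order="3">
# </c></a>
```

The counter i is never reset, so the numbering continues across lines just as in the one-liner above.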
