Fixing an XPath predicate for use in XML::Twig
I'm trying to write a subroutine in Perl that will delete a given node in an XML document when provided with the text values of some of its child nodes.
Given XML like:
<Path>
  <To>
    <My>
      <Node>
        <ChildA>ValA</ChildA>
        <ChildB>ValB</ChildB>
        <ChildC>ValC</ChildC>
      </Node>
    </My>
  </To>
</Path>
<!-- A lot of siblings follow... -->
The XPath expression I'm using is essentially:
/Path/To/My/Node[ChildA="ValA" and ChildB="ValB" and ChildC="ValC"]
When I run my script, I get an error like:
Error in XPath expression
/Path/To/My/Node[ChildA="ValA" and ChildB="ValB" and ChildC="ValC"] at
ChildA="ValA" and ChildB="ValB" and ChildC="ValC" at Twig.pm line 3353
I'm at a loss and am looking for suggestions. I've tried googling around, but I can't find working examples of predicates like this in XML::Twig. I don't know whether the problem is in my XPath syntax or in how I'm using XML::Twig.
For good measure, I've also tried:
/Path/To/My/Node[ChildA/text()="ValA" and ChildB/text()="ValB" and ChildC/text()="ValC"]
No luck with that either. What is the solution?
Within the test, Node is the context node, so you have to say:
/Path/To/My/Node[./ChildA="ValA" and ./ChildB="ValB" and ./ChildC="ValC"]
This works for me in a short test program that uses XML::XPath.
EDIT: Sorry, I'm not that familiar with XML::Twig, and I made an incorrect assumption about its XPath capabilities. According to the documentation, it supports only an "XPath-like" syntax that doesn't cover expressions as complex as your example. However, if you use XML::Twig::XPath instead of XML::Twig, you get the full XPath engine:
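use XML::Twig::XPath;
my $twig = XML::Twig::XPath->new;
$twig->parse('your string');
# in scalar context, findnodes returns a node set that stringifies to its text
my $nodes = $twig->findnodes('/Path/To/My/Node[ChildA="ValA" and ChildB="ValB" and ChildC="ValC"]');
print $nodes;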
This prints "ValAValBValC".
There are two ways to do this: load the whole XML document, delete the nodes you don't want, and then output the twig; or filter as you go along, which is a little more complex but uses less memory.
The first way (you may need a recent version of XML::XPathEngine; I haven't tested it with older ones or with XML::XPath, which can also act as the XPath engine):
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig::XPath;
my $t= XML::Twig::XPath->new( pretty_print => 'indented')
                       ->parse( \*DATA);
$_->delete for ($t->findnodes( '/Path/To/My/Node[./ChildA="ValA" and ./ChildB="ValB" and ./ChildC="ValC"]'));
$t->print;
__DATA__
<Path>
<To>
<My>
<Node>
<ChildA>ValA</ChildA>
<ChildB>ValB</ChildB>
<ChildC>ValC</ChildC>
</Node>
<Node>
<ChildA>ValD</ChildA>
<ChildB>ValB</ChildB>
<ChildC>ValC</ChildC>
</Node>
</My>
</To>
</Path>
And the "filter" way:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
XML::Twig->new( twig_roots => { '/Path/To/My/Node' => \&filter },
                twig_print_outside_roots => 1,
                keep_spaces => 1,
              )
         ->parse( \*DATA);
exit;
# the handler expressions cannot lookahead, so we need to look at each node
# once it's completely parsed
sub filter
  { my( $t, $node)= @_;
    if(    ($node->field( 'ChildA') eq 'ValA')
        && ($node->field( 'ChildB') eq 'ValB')
        && ($node->field( 'ChildC') eq 'ValC')
      )
      { $node->delete; }
    else
      { $t->flush; }
  }
__DATA__
<Path>
<To>
<My>
<Node>
<ChildA>ValA</ChildA>
<ChildB>ValB</ChildB>
<ChildC>ValC</ChildC>
</Node>
<Node>
<ChildA>ValD</ChildA>
<ChildB>ValB</ChildB>
<ChildC>ValC</ChildC>
</Node>
</My>
</To>
</Path>
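Since the question asks for a subroutine that takes the child values as input, the first approach can be wrapped up roughly as follows. This is only a sketch: delete_matching is a hypothetical helper name, not part of XML::Twig, and it assumes the child names are plain element names and the values contain no double quotes (they are pasted into the XPath string unescaped).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig::XPath;

# Hypothetical helper (not part of XML::Twig): deletes every element found at
# $path whose child elements have the given text values, and returns the
# number of elements deleted.
# Assumption: values contain no double quotes, since they are not escaped.
sub delete_matching
  { my( $twig, $path, %want)= @_;
    # build e.g. ./ChildA="ValA" and ./ChildB="ValB"
    my $predicate= join ' and ', map { qq{./$_="$want{$_}"} } sort keys %want;
    my @nodes= $twig->findnodes( $path . "[$predicate]");
    $_->delete for @nodes;
    return scalar @nodes;
  }

my $xml= '<Path><To><My>'
       . '<Node><ChildA>ValA</ChildA><ChildB>ValB</ChildB><ChildC>ValC</ChildC></Node>'
       . '<Node><ChildA>ValD</ChildA><ChildB>ValB</ChildB><ChildC>ValC</ChildC></Node>'
       . '</My></To></Path>';

my $t= XML::Twig::XPath->new( pretty_print => 'indented')->parse( $xml);

my $deleted= delete_matching( $t, '/Path/To/My/Node',
                              ChildA => 'ValA', ChildB => 'ValB', ChildC => 'ValC');
my @remaining= $t->findnodes( '/Path/To/My/Node');

$t->print;
```

Only the first Node matches all three values, so it is the only one removed; passing fewer key/value pairs loosens the match accordingly.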