Looping with conditions through an XML file using lxml etree
I wonder if it's possible to make conditional statements connected with a tree.findall("...") statement in the lxml library?
I have the following XML structure in a file:
<sss version="1.2">
  <date>2011-09-23</date>
  <time>12:32:29</time>
  <origin>OPST</origin>
  <user></user>
  <survey>
    <name>Test</name>
    <version>2011-09-02 15:50:10</version>
    <record ident="A">
      <variable ident="10" type="quantity">
        <name>no_v</name>
        <label>Another question</label>
        <position start="23" finish="24"/>
        <values>
          <range from="0" to="32"/>
        </values>
      </variable>
      <variable ident="11" type="quantity">
        <name>v_683</name>
        <label>another totally another Question</label>
        <position start="25" finish="26"/>
        <values>
          <range from="0" to="33"/>
        </values>
      </variable>
      <variable ident="12" type="quantity">
        <name>v_684</name>
        <label>And once more Question</label>
        <position start="27" finish="29"/>
        <values>
          <range from="0" to="122"/>
        </values>
      </variable>
      <variable ident="20" type="single">
        <name>v_684</name>
        <label>Question with alternatives</label>
        <position start="73" finish="73"/>
        <values>
          <range from="1" to="6"/>
          <value code="1">Alternative 1</value>
          <value code="2">Alternative 2</value>
          <value code="3">Alternative 3</value>
          <value code="6">Alternative 4</value>
        </values>
      </variable>
    </record>
  </survey>
</sss>
What I want to do now is get only the survey/record/variable/name .text and the survey/record/variable/values/value .text if the name starts with "v_".
So far I have the first part:
from lxml import etree as ET
tree = ET.parse('scheme.xml')
[elem.text for elem in tree.iter(tag='name') if elem.text.startswith('v_')]
But how can I get the survey/record/variable/values/value .text of the SAME element...and use survey/record/variable/name .text like a filter? Thanks a lot!
[(elem.text, elem.getparent().xpath('values/value/text()'))
 for elem in tree.iter(tag='name') if elem.text.startswith('v_')]
yields
[('v_683', []),
('v_684', []),
('v_684',
['Alternative 1', 'Alternative 2', 'Alternative 3', 'Alternative 4'])]
elem is a name element. So to get the associated values, you can first find its parent (variable), then search for the values child, and then the value subchild elements.
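The parent-then-child lookup described above can be sketched as a small self-contained script. This is only a sketch under a few assumptions: the XML is an abbreviated copy of the file from the question, the helper name names_with_values is made up for illustration, and the guard for a name element without text is an addition that the original one-liner does not have.

```python
from lxml import etree as ET

# Abbreviated copy of the survey document from the question.
XML = b"""<sss version="1.2">
  <survey>
    <name>Test</name>
    <record ident="A">
      <variable ident="10" type="quantity">
        <name>no_v</name>
        <values><range from="0" to="32"/></values>
      </variable>
      <variable ident="20" type="single">
        <name>v_684</name>
        <values>
          <range from="1" to="6"/>
          <value code="1">Alternative 1</value>
          <value code="2">Alternative 2</value>
        </values>
      </variable>
    </record>
  </survey>
</sss>"""

root = ET.fromstring(XML)


def names_with_values(root, prefix="v_"):
    """Hypothetical helper: (name text, [value texts]) for matching names."""
    pairs = []
    for name in root.iter("name"):
        text = name.text or ""  # an empty <name/> has .text == None
        if text.startswith(prefix):
            variable = name.getparent()  # step up to the enclosing element
            pairs.append((text, variable.xpath("values/value/text()")))
    return pairs


result = names_with_values(root)
print(result)
```

The name element directly under survey is visited too, but it is filtered out because "Test" does not start with "v_".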
An alternative that removes the getparent call, but uses a slightly more complicated XPath, is:
[(elem.text, elem.xpath('following-sibling::values/value/text()'))
 for elem in tree.iter(tag='name') if elem.text.startswith('v_')]
following-sibling:: tells XPath to generate all siblings that come after name in the document. following-sibling::values tells XPath to generate only those following siblings of name that are values elements.
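To see what the axis selects, here is a small sketch that runs the expressions against a single variable element. The fragment is cut down from the file in the question, and the variable names all_following, only_values and texts are just illustrative.

```python
from lxml import etree as ET

# One <variable> element, shortened from the question's file.
variable = ET.fromstring(
    b'<variable ident="20" type="single">'
    b'<name>v_684</name>'
    b'<label>Question with alternatives</label>'
    b'<position start="73" finish="73"/>'
    b'<values><value code="1">Alternative 1</value></values>'
    b'</variable>'
)
name = variable.find("name")

# Every element sibling that comes after <name>, in document order.
all_following = [el.tag for el in name.xpath("following-sibling::*")]

# Only the following siblings whose tag is "values".
only_values = [el.tag for el in name.xpath("following-sibling::values")]

# The full expression from the answer: text of each <value> under <values>.
texts = name.xpath("following-sibling::values/value/text()")

print(all_following)  # ['label', 'position', 'values']
print(only_values)    # ['values']
print(texts)          # ['Alternative 1']
```

Since name is the first child of variable here, both approaches find the same values element. The axis version would miss it only if values appeared before name.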