Looping with conditions through an XML file using lxml etree

I wonder whether it's possible to attach conditional statements to a tree.findall("...") call in the lxml library.

I have the following XML structure in a file:

    <sss version="1.2">
    <date>2011-09-23</date>
    <time>12:32:29</time>
    <origin>OPST</origin>
    <user></user>
    <survey>
        <name>Test</name>
        <version>2011-09-02 15:50:10</version>
        <record ident="A">
            <variable ident="10" type="quantity">
                <name>no_v</name>
                <label>Another question</label>
                <position start="23" finish="24"/>
                <values>
                    <range from="0" to="32"/>
                </values>
            </variable>
            <variable ident="11" type="quantity">
                <name>v_683</name>
                <label>another totally another Question</label>
                <position start="25" finish="26"/>
                <values>
                    <range from="0" to="33"/>
                </values>
            </variable>
            <variable ident="12" type="quantity">
                <name>v_684</name>
                <label>And once more Question</label>
                <position start="27" finish="29"/>
                <values>
                    <range from="0" to="122"/>
                </values>
            </variable>
            <variable ident="20" type="single">
                <name>v_684</name>
                <label>Question with alternatives</label>
                <position start="73" finish="73"/>
                <values>
                    <range from="1" to="6"/>
                    <value code="1">Alternative 1</value>
                    <value code="2">Alternative 2</value>
                    <value code="3">Alternative 3</value>
                    <value code="6">Alternative 4</value>
                </values>
            </variable>
        </record>
    </survey>
    </sss>

What I want now is to get the survey/record/variable/name .text and the survey/record/variable/values/value .text, but only for variables whose name starts with "v_".

So far I have the first part:

    from lxml import etree as ET

    tree = ET.parse('scheme.xml')
    # iter() replaces the deprecated getiterator()
    [elem.text for elem in tree.iter(tag='name') if elem.text.startswith('v_')]

But how can I get the survey/record/variable/values/value .text of the SAME element, using the survey/record/variable/name .text as a filter? Thanks a lot!


    [(elem.text, elem.getparent().xpath('values/value/text()'))
     for elem in tree.iter(tag='name') if elem.text.startswith('v_')]

yields

    [('v_683', []),
     ('v_684', []),
     ('v_684',
      ['Alternative 1', 'Alternative 2', 'Alternative 3', 'Alternative 4'])]

Here elem is a name element. To get its associated values, first find its parent (the variable element), then look up that parent's values child and, within it, the value elements.
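The same lookup can also be turned around: loop over the variable elements and read both name and values/value from each one, so no getparent call is needed. The following is only a rough sketch, not the answer's method. The inline XML is a shortened copy of the file from the question, and the names XML and result are made up for illustration. It uses only iter, findtext and findall, which lxml shares with the standard library's xml.etree.ElementTree, so the import falls back to the latter if lxml is not installed.

```python
try:
    from lxml import etree as ET
except ImportError:  # iter/findtext/findall also exist in the standard library
    import xml.etree.ElementTree as ET

# Shortened copy of the survey file from the question (illustrative data).
XML = """
<sss version="1.2">
  <survey>
    <name>Test</name>
    <record ident="A">
      <variable ident="10" type="quantity">
        <name>no_v</name>
        <values><range from="0" to="32"/></values>
      </variable>
      <variable ident="11" type="quantity">
        <name>v_683</name>
        <values><range from="0" to="33"/></values>
      </variable>
      <variable ident="20" type="single">
        <name>v_684</name>
        <values>
          <range from="1" to="6"/>
          <value code="1">Alternative 1</value>
          <value code="2">Alternative 2</value>
        </values>
      </variable>
    </record>
  </survey>
</sss>
"""

root = ET.fromstring(XML)

result = []
for variable in root.iter('variable'):
    # findtext returns None when there is no <name> child, hence the "or ''".
    name = variable.findtext('name') or ''
    if name.startswith('v_'):
        # Paths are relative to the current <variable>, so the values
        # always belong to the same element as the name.
        values = [v.text for v in variable.findall('values/value')]
        result.append((name, values))

print(result)
```

Because the loop starts at variable, the survey-level name element ("Test") is never considered, and an empty or missing name is skipped instead of raising an error.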


An alternative that removes the getparent call, but uses a slightly more complicated XPath, is:

    [(elem.text, elem.xpath('following-sibling::values/value/text()'))
     for elem in tree.iter(tag='name') if elem.text.startswith('v_')]

following-sibling:: tells XPath to select the siblings of name that come after it in document order.

following-sibling::values narrows that to the following siblings that are values elements.
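To see what the axis selects, here is a small sketch that evaluates a few XPath expressions against a single variable element. The inline XML is a cut-down version of variable 20 from the question, and the names after, only_values, before and texts are just illustrative.

```python
from lxml import etree as ET

# One <variable> from the question, reduced to a single <value> (illustrative).
variable = ET.fromstring(
    '<variable ident="20" type="single">'
    '<name>v_684</name>'
    '<label>Question with alternatives</label>'
    '<position start="73" finish="73"/>'
    '<values><value code="1">Alternative 1</value></values>'
    '</variable>'
)
name = variable.find('name')

# Every sibling that comes after <name>, in document order.
after = [e.tag for e in name.xpath('following-sibling::*')]

# Only those following siblings that are <values> elements.
only_values = [e.tag for e in name.xpath('following-sibling::values')]

# <name> is the first child, so nothing precedes it.
before = [e.tag for e in name.xpath('preceding-sibling::*')]

# The full expression from the answer: text of each <value> below <values>.
texts = name.xpath('following-sibling::values/value/text()')

print(after, only_values, before, texts)
```

Note that the expression relies on values appearing after name inside each variable, as it does in the file shown in the question; if the order were reversed, following-sibling::values would find nothing.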
