开发者

Select elements with unique values

I'm trying to parse an OpenOffice spreadsheet to obtain rows with unique values in the first column.

I.E., I would like to retrieve from the following XML fragment all <table:table-row> elements with unique <text:p> values in the first child <table:table-cell>.

    <table:table table:name="foo">
        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>foo</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>2</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>bar</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>baz</text:p>
            </table:table-cell>
        </table:table-row>
    </table:table>

I'll like to get the below output as Nodes

        <table:table-row>
            <table:table-cell>
                <text:p>1</text:p>
            </table:table-c开发者_开发百科ell>
            <table:table-cell>
                <text:p>foo</text:p>
            </table:table-cell>
        </table:table-row>
        <table:table-row>
            <table:table-cell>
                <text:p>2</text:p>
            </table:table-cell>
            <table:table-cell>
                <text:p>bar</text:p>
            </table:table-cell>
        </table:table-row>

How can I do this with XPath?


This XPath produces desired output: /table:table/table:table-row[not(./table:table-cell[1]/text:p/text() = preceding-sibling::table:table-row/table:table-cell[1]/text:p/text())]


Pure XPath should be:

 /table:table/table:*[not(
  .//text:p[1]
   = preceding-sibling::table:table-row//text:p[1]
 )]

If with expected output you mean a sequence of table:row nodes and not an xml document as someone correctly notice in the comments.

 /table:table/table:*[not(
  ./table:*[1]//text:*[1]
   = preceding-sibling::table:*/table:*[1]/text:*[1]
 )]
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜