Select elements with unique values
I'm trying to parse an OpenOffice spreadsheet to obtain rows with unique values in the first column.
I.e., I would like to retrieve from the following XML fragment all <table:table-row> elements with unique <text:p> values in the first child <table:table-cell>.
<table:table table:name="foo">
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>foo</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>2</text:p>
</table:table-cell>
<table:table-cell>
<text:p>bar</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>baz</text:p>
</table:table-cell>
</table:table-row>
</table:table>
I'd like to get the output below as nodes:
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>foo</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>2</text:p>
</table:table-cell>
<table:table-cell>
<text:p>bar</text:p>
</table:table-cell>
</table:table-row>
How can I do this with XPath?
This XPath produces the desired output:
/table:table/table:table-row[not(./table:table-cell[1]/text:p/text() = preceding-sibling::table:table-row/table:table-cell[1]/text:p/text())]
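To check the result without an XPath engine, the same filter can be sketched in Python with the standard library. This is only a rough sketch and not part of the answer above: the function name `unique_first_column_rows` is hypothetical, and the namespace URIs are the usual OpenDocument ones, which the fragment in the question does not declare, so they are an assumption here. `xml.etree.ElementTree` does not support the `preceding-sibling` axis, so the sketch remembers the values it has already seen instead of evaluating the XPath. It also assumes each `text:p` holds plain text with no nested elements.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs (assumed; the question's fragment omits the declarations).
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

SAMPLE = """\
<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
             xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
             table:name="foo">
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>foo</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>2</text:p></table:table-cell>
    <table:table-cell><text:p>bar</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>baz</text:p></table:table-cell>
  </table:table-row>
</table:table>
"""


def unique_first_column_rows(table):
    """Keep rows whose first-cell text:p value did not occur in any earlier row."""
    seen = set()
    kept = []
    for row in table.findall("table:table-row", NS):
        first_cell = row.find("table:table-cell", NS)
        values = set()
        if first_cell is not None:
            # Simplification: p.text is the text before any nested element.
            values = {p.text for p in first_cell.findall("text:p", NS) if p.text}
        if not (values & seen):
            kept.append(row)
        # Every earlier row counts as a preceding sibling, kept or not.
        seen |= values
    return kept


table = ET.fromstring(SAMPLE)
rows = unique_first_column_rows(table)
second_column = [
    row.findall("table:table-cell", NS)[1].find("text:p", NS).text for row in rows
]
print(second_column)  # ['foo', 'bar']
```

The third row is dropped because its first-cell value `1` already appeared in the first row, which matches the output requested in the question.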
Pure XPath should be:
/table:table/table:*[not(
.//text:p[1]
= preceding-sibling::table:table-row//text:p[1]
)]
This holds if by expected output you mean a sequence of table:table-row nodes rather than an XML document, as someone correctly noted in the comments.
A stricter variant that compares only the first cell of each row:
/table:table/table:*[not(
./table:*[1]//text:*[1]
= preceding-sibling::table:*/table:*[1]/text:*[1]
)]
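The two expressions in this answer can give different results when a value in a later column repeats an earlier first-column value, because `.//text:p[1]` matches the first text:p under every cell of the row, not only the first cell. The sketch below emulates both comparisons in Python rather than running the XPath, since the standard library has no `preceding-sibling` axis. The names `compared_values` and `keep_rows` and the two-row document are hypothetical, the namespace URIs are assumed to be the standard OpenDocument ones, and cells are assumed to contain text:p elements directly.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs (assumed; not declared in the question's fragment).
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Hypothetical data: the second row's second column repeats the value "1".
DOC = """\
<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
             xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>foo</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>2</text:p></table:table-cell>
    <table:table-cell><text:p>1</text:p></table:table-cell>
  </table:table-row>
</table:table>
"""


def compared_values(row, first_cell_only):
    """String values one row contributes to the comparison (emulated)."""
    cells = row.findall("table:table-cell", NS)
    if first_cell_only:
        cells = cells[:1]
    values = set()
    for cell in cells:
        p = cell.find("text:p", NS)  # first text:p child of this cell
        if p is not None:
            values.add("".join(p.itertext()))
    return values


def keep_rows(table, first_cell_only):
    """Keep rows that share no compared value with any preceding row."""
    seen, kept = set(), []
    for row in table.findall("table:table-row", NS):
        values = compared_values(row, first_cell_only)
        if not (values & seen):
            kept.append(row)
        seen |= values
    return kept


table = ET.fromstring(DOC)
print(len(keep_rows(table, first_cell_only=False)))  # 1
print(len(keep_rows(table, first_cell_only=True)))   # 2
```

With all columns compared, the second row is dropped even though its first-column value `2` is unique. Restricting the comparison to the first cell keeps both rows.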