How do I grab text of multiple tags in an xml feed using one xpath expression?

Posted: 2023-02-02 01:25

I'm trying to parse an xml feed that looks something like this:

<item>
<title>item title</title>
<link>item link</link>
<description>item description</description>
</item>

I'm trying to find an xpath expression that will retrieve all the details of each item so that each item in the feed is contained within its own array or grouped in some way. I tried using //item/* but the tags are not grouped, although they are correctly ordered.

Is there any way of doing that?

Edit: the result I'm after would be grouped something like this:

[
[title1, link1, desc1],
[title2, link2, desc2],
[title3, link3, desc3]
]

From http://www.w3.org/TR/xpath/#section-Introduction

An expression is evaluated to yield an object, which has one of the following four basic types:

node-set (an unordered collection of nodes without duplicates)

boolean (true or false)

number (a floating-point number)

string (a sequence of UCS characters)

So XPath 1.0 has no "structure" data type such as a tuple. The standard solution for your task is to select the parent elements and iterate over them, getting the children of each with any DOM API method.

With this input

<root>
<item>
    <title>item title</title>
    <link>item link</link>
    <description>item description</description>
</item>
<item>
    <title>item2</title>
    <link>link2</link>
    <description>description2</description>
</item>
</root>

And this xsl

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>

    <xsl:template match="item">
        <xsl:value-of select="title"/><xsl:text>
</xsl:text>
        <xsl:value-of select="link"/><xsl:text>
</xsl:text>
        <xsl:value-of select="description"/><xsl:text>
</xsl:text>
    </xsl:template>

</xsl:stylesheet>

You get this output

item title
item link
item description

item2
link2
description2
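The same "select the parents, then iterate over their children" idea can also be sketched outside XSLT. The following is only an illustration, assuming Python's standard-library xml.etree.ElementTree as the DOM-style API; the helper name group_items is made up for this example and is not part of any library.

```python
import xml.etree.ElementTree as ET

# Same sample document as above.
FEED = """<root>
<item>
    <title>item title</title>
    <link>item link</link>
    <description>item description</description>
</item>
<item>
    <title>item2</title>
    <link>link2</link>
    <description>description2</description>
</item>
</root>"""


def group_items(xml_text):
    """Return one list of child-element texts per <item> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Select the parents first, then walk each parent's children in order.
    return [[child.text for child in item] for item in root.findall("item")]


rows = group_items(FEED)
print(rows)
# [['item title', 'item link', 'item description'], ['item2', 'link2', 'description2']]
```

The grouping comes from the host language's nested lists, not from XPath itself.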

I hope this helps.

Here's an XPath 2.0 expression returning a sequence (assuming the XML input document from Stefanos' answer):

for $item in /root/item
  return ($item/title/text(), $item/link/text(), $item/description/text())

Sequences are ordered but do not allow nesting, so you cannot get exactly the kind of data structure you are asking for with pure XPath. With XSLT (or another host language), you can create new objects that provide the desired structure.
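As a rough sketch of letting a host language add the nesting, the flat, ordered result can be cut into fixed-size groups afterwards. This assumes Python with the standard-library xml.etree.ElementTree (which supports only a subset of XPath 1.0, not XPath 2.0) and, importantly, assumes every item has exactly three children; regroup and FIELDS_PER_ITEM are names invented for this example.

```python
import xml.etree.ElementTree as ET

FEED = """<root>
<item>
    <title>item title</title>
    <link>item link</link>
    <description>item description</description>
</item>
<item>
    <title>item2</title>
    <link>link2</link>
    <description>description2</description>
</item>
</root>"""

# Assumption: every <item> has exactly title, link and description.
FIELDS_PER_ITEM = 3


def regroup(flat, size):
    """Split a flat list into consecutive chunks of `size` (hypothetical helper)."""
    return [flat[i:i + size] for i in range(0, len(flat), size)]


root = ET.fromstring(FEED)
# ".//item/*" is the ElementTree spelling of //item/*: a flat result in document order.
flat = [el.text for el in root.findall(".//item/*")]
nested = regroup(flat, FIELDS_PER_ITEM)
print(nested)
# [['item title', 'item link', 'item description'], ['item2', 'link2', 'description2']]
```

If an item can be missing a child, fixed-size chunking will misalign the groups, so iterating per item is the safer choice.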

You haven't specified a language, but if you're using Python (which is what the data structure you presented looks like), it's easy enough to do using lxml:

>>> from lxml import etree
>>> d = etree.fromstring("""<doc>
...  <item>
...   <title>item 1 title</title>
...   <link>item 1 link</link>
...   <description>item 1 description</description>
...  </item>
...  <item>
...   <title>item 2 title</title>
...   <link>item 2 link</link>
...   <description>item 2 description</description>
...  </item>
... </doc>""")
>>> [[e.xpath("title")[0].text,
...   e.xpath("description")[0].text,
...   e.xpath("link")[0].text]
...  for e in d.xpath("/doc/item")]
[['item 1 title', 'item 1 description', 'item 1 link'], ['item 2 title', 'item 2 description', 'item 2 link']]

This isn't quite so easy to do in a list comprehension if the XML's structure is unreliable; the above breaks if there's an item element that doesn't have a 'link' child, for instance.
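One possible way to tolerate a missing child is to look each field up with findtext, which returns a default instead of raising. The sketch below uses the standard-library xml.etree.ElementTree rather than lxml so that it runs without extra packages; the sample feed with a missing link, and the names FIELDS and extract_rows, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed in which the second <item> has no <link> child.
FEED = """<doc>
 <item>
  <title>item 1 title</title>
  <link>item 1 link</link>
  <description>item 1 description</description>
 </item>
 <item>
  <title>item 2 title</title>
  <description>item 2 description</description>
 </item>
</doc>"""

FIELDS = ("title", "link", "description")


def extract_rows(xml_text, fields=FIELDS):
    """Return one row per <item>; absent fields become None (hypothetical helper)."""
    doc = ET.fromstring(xml_text)
    # findtext returns its default (None here) when the child element is
    # absent, so a missing <link> does not raise IndexError.
    return [[item.findtext(name) for name in fields]
            for item in doc.findall("item")]


rows = extract_rows(FEED)
print(rows)
# [['item 1 title', 'item 1 link', 'item 1 description'], ['item 2 title', None, 'item 2 description']]
```

Every row keeps the same length and field order, so a None marks exactly which field was missing.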
