XPath functions normalize-space() and starts-with() don't find an item with &nbsp; right before the item

I have this piece of HTML code:

<td>
<a href="CMenu?op=m&menuID=42&rurl={{rurl}}" title="Edit menu">
<img border="0" alt="" src="Images/application.gif">
&nbsp;Case
</a>
</td>

and I need to find the text "Case".

I have tried several XPath queries, but none of them works; they all find nothing:

//a[text() = ' Case']
//a[text() = 'Case']
//td/a[normalize-space(text()) = 'Case']
//td[a[normalize-space(.) = 'Case']]
//td[a[normalize-space(text()) = 'Case']]
//*[starts-with(.," Case")]
//*[starts-with(.,"Case")]

But, when I try to find this item with

//a[contains(.,'Case') and string-length() = 5]

it works, but I cannot accept the hard-coded '5'; I want to make this XPath reusable for other similar items.

And if I use

//a[contains(.,'Case')]

I get a list of items that contain 'Case', but I need only the one whose text is 'Case' with no other characters. Maybe I am doing something wrong; I would just like this made clear to me.


My colleague told me that I can use this XPath:

//a[substring(text(),2)='Case']

It simply skips the leading space. I found that it works, but it seems to me there should be another way to solve my problem without skipping anything.


Use:

/td/a
      [starts-with(normalize-space(), '&#xA0;Case')]

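XPath isn't entity-aware, and normalize-space() by definition processes only the whitespace characters space, NL, CR and Tab, so &#xA0; (which is what the entity &nbsp; expands to) isn't considered a whitespace character. In the expression above, the character reference &#xA0; is resolved by the XML parser that reads the stylesheet; when the XPath is written outside an XML document, the string literal must contain the actual U+00A0 character instead.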
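To make the whitespace rule concrete, here is a minimal Python sketch that imitates normalize-space() on roughly the string value of the <a> element. The helper xpath_normalize_space is a hypothetical stand-in written for this illustration, not a real XPath engine. The last line replaces the non-breaking space with an ordinary space first, which is what the XPath 1.0 expression normalize-space(translate(., '&#xA0;', ' ')) = 'Case' would do; that variant is only one possible way to get an exact match without skipping a fixed number of characters.

```python
import re

NBSP = "\u00a0"  # the character that &nbsp; / &#xA0; stands for


def xpath_normalize_space(value: str) -> str:
    """Approximate XPath 1.0 normalize-space(): only space, tab, CR and LF count."""
    # Python's bare str.strip() and the regex class \s also treat U+00A0 as
    # whitespace, so the four XPath whitespace characters are listed explicitly.
    collapsed = re.sub(r"[ \t\r\n]+", " ", value)
    return collapsed.strip(" ")


# Roughly the string value of the <a> element in the sample document.
link_text = "\n\n" + NBSP + "Case\n"

normalized = xpath_normalize_space(link_text)
print(normalized == "Case")         # False: the non-breaking space survives
print(normalized == NBSP + "Case")  # True

# Counterpart of translate(., '&#xA0;', ' ') followed by normalize-space().
print(xpath_normalize_space(link_text.replace(NBSP, " ")) == "Case")  # True
```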

XSLT-based verification:

This transformation selects with the above XPath expression and outputs the result:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <xsl:copy-of select=
  "/td/a
      [starts-with(normalize-space(), '&#xA0;Case')]
   "/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<td>
<a href="CMenu?op=m&amp;menuID=42&amp;rurl={{rurl}}" title="Edit menu">
<img border="0" alt="" src="Images/application.gif"/>
&#xA0;Case
</a>
</td>

the wanted, correct result is produced:

<a href="CMenu?op=m&amp;menuID=42&amp;rurl={{rurl}}" title="Edit menu">
  <img border="0" alt="" src="Images/application.gif" />
 Case
</a>


You might also want to try //a[matches(., '^\W*Case\W*$')], which will match a elements whose value is "Case", possibly surrounded by non-word characters such as whitespace (including the non-breaking space) and punctuation. Note that matches() requires XPath 2.0 or later.
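As a rough check of that pattern, here is a small Python sketch that applies the same regular expression with Python's re module. Python's regex flavor is not identical to the one XPath 2.0 uses, so treat it as an approximation; is_exactly_case and the sample strings are made up for this illustration.

```python
import re

# Same pattern as in the XPath expression above, compiled with Python's re.
PATTERN = re.compile(r"^\W*Case\W*$")


def is_exactly_case(value: str) -> bool:
    """Return True when the value is 'Case' surrounded only by non-word characters."""
    return PATTERN.search(value) is not None


samples = ["\n\u00a0Case\n", "Case", "Case study", "Use Case"]
for sample in samples:
    print(repr(sample), is_exactly_case(sample))
```

The first two samples match and the last two do not, so links such as "Case study" or "Use Case" would be filtered out.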
