XPath _relative_ to given element in HTMLUnit/Groovy?

I would like to evaluate an XPath expression relative to a given element.

I have been reading here: http://www.w3schools.com/xpath/default.asp

It seems that one of the syntaxes below should work (especially the ones with no leading slash, or with descendant:).

However, none seem to work in HTMLUnit. Any help much appreciated (oh this is a groovy script btw). Thank you!

http://htmlunit.sourceforge.net/

http://groovy.codehaus.org/

Misha


#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
 <div class='leveltwo'>
    <div class='levelthree' />
 </div>
 <div class='leveltwo'>
    <div class='levelthree' />
    <div class='levelthree' />
 </div>
</div>

</body>
</html>
"""

// Write the test page; assigning to File.text overwrites any existing file and closes the stream
def f=new File('/tmp/test.html')
f.text=html

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant:div[@class='levelone']") // this gives namespace error
assert element.size()==0

Thank you!!!


The question does not make clear which element the XPath expressions are evaluated relative to. Assuming the context node is the document node, each of the following XPath expressions selects the desired node:

   */*/div[@class='levelone']

   html/body/div[@class='levelone']

   descendant::div[@class='levelone']
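
To make the role of the context node concrete, here is a minimal sketch in Groovy. It uses the JDK's own javax.xml.xpath API rather than HTMLUnit so that it runs without extra dependencies; the count helper and the trimmed-down test document are my own names, not something from the question. I would expect the same relative expressions to behave the same way when passed to getByXPath on an HTMLUnit element instead of the page, but treat that as an assumption to verify.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Well-formed version of the test page from the question
def xml = """<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'><div class='levelthree'/></div>
  <div class='leveltwo'><div class='levelthree'/><div class='levelthree'/></div>
</div>
</body>
</html>"""

def doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
def xpath = XPathFactory.newInstance().newXPath()

// Hypothetical helper: how many nodes does expr select, starting from context?
def count = { String expr, Object context ->
    xpath.evaluate(expr, context, XPathConstants.NODESET).getLength()
}

// Context node = the document node
assert count("descendant::div[@class='levelone']", doc) == 1
assert count("html/body/div[@class='levelone']", doc) == 1
assert count("*/*/div[@class='levelone']", doc) == 1
assert count("div[@class='levelone']", doc) == 0   // only looks at children of the document node
assert count("/div[@class='levelone']", doc) == 0  // the root element is html, not div

// Context node = a given element
def levelOne = xpath.evaluate("//div[@class='levelone']", doc, XPathConstants.NODE)
assert count("div[@class='leveltwo']", levelOne) == 2                // children of levelOne
assert count("descendant::div[@class='levelthree']", levelOne) == 3  // anywhere below levelOne
assert count(".//div[@class='levelthree']", levelOne) == 3           // same thing, abbreviated
assert count("//div[@class='leveltwo']", levelOne) == 2              // leading // restarts at the document root
```

The last assertion is the usual trap: an expression that starts with / or // ignores the element you evaluate it on, so a relative expression has to start with a step name, an axis, or a dot.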

You may have a problem if the actual XML document (not shown) has a default namespace. In that case you need to register this namespace in the language hosting XPath (I don't know Groovy) and use the associated prefix, like this:

   */*/x:div[@class='levelone']

   x:html/x:body/x:div[@class='levelone']

   descendant::x:div[@class='levelone']
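
For what it's worth, here is a rough sketch of what that registration could look like from Groovy, again with the JDK's javax.xml.xpath API rather than HTMLUnit. The XHTML namespace URI, the x prefix, and the count helper are assumptions chosen for illustration; the actual document may use a different namespace or none at all.

```groovy
import javax.xml.XMLConstants
import javax.xml.namespace.NamespaceContext
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Assumed example: a document whose elements live in a default namespace
def nsUri = 'http://www.w3.org/1999/xhtml'
def xml = """<html xmlns='${nsUri}'><body><div class='levelone'/></body></html>"""

def dbf = DocumentBuilderFactory.newInstance()
dbf.namespaceAware = true   // without this the parser ignores namespaces
def doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

def xpath = XPathFactory.newInstance().newXPath()
// Bind the prefix x to the document's default namespace
xpath.namespaceContext = [
    getNamespaceURI: { String prefix -> prefix == 'x' ? nsUri : XMLConstants.NULL_NS_URI },
    getPrefix      : { String uri -> null },
    getPrefixes    : { String uri -> null }
] as NamespaceContext

// Hypothetical helper: number of nodes selected from the document node
def count = { String expr ->
    xpath.evaluate(expr, doc, XPathConstants.NODESET).getLength()
}

// Unprefixed names mean "no namespace" in XPath 1.0, so they no longer match
assert count("descendant::div[@class='levelone']") == 0
// The prefixed forms do
assert count("descendant::x:div[@class='levelone']") == 1
assert count("x:html/x:body/x:div[@class='levelone']") == 1
assert count("*/*/x:div[@class='levelone']") == 1
```

The prefix in the expression does not have to match anything in the document; only the namespace URI it is bound to matters.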


Thank you so much. Apparently my error was using a single colon after descendant rather than two (doh): the axis syntax is descendant::, not descendant:.

#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'>
     <div class='levelthree' />
  </div>
  <div class='leveltwo'>
     <div class='levelthree' />
     <div class='levelthree' />
  </div>
</div>

</body>
</html>
"""

// Write the test page; assigning to File.text overwrites any existing file and closes the stream
def f=new File('/tmp/test.html')
f.text=html

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant::div[@class='levelone']")
assert element.size()==1

Doh!

Thank you!

Misha
