开发者

Xpath getting node without node child contents

hey guys coudln't get around this. I have an html structured as follow:

<div class="review-text">
<div id开发者_如何学C="reviewerprofile">
<div id="revimg"></div>
<div id="reviewr">marc</div>
<div id="revdate">2011-07-06</div>
</div>
this is an awesome review

</div>

what i am trying to get is just the text "this is an awesome review" but everytyme i query the node i also get the other content in the childs. using something like this now ".//div[@class='review-text']" how to get just that text only? tank you very much


You're almost there! Just add /text() at the end of your XPath to get the text node.


An XPath expression such as //div returns a set of nodes, in this case div elements. These are in effect pointers to the original nodes in the original tree; the nodes are still connected to their parents, children, ancestors, and siblings. If you see the children of the div element and don't want them, that's not the fault of the XPath processor, it's the fault of whatever software is processing the results returned by the XPath expression.

You can get the text that's an immediate child of the div element by using /text() as suggested. However, that assumes that you know exactly what you are expecting to find in the HTML page - if "awesome" were in italics, it would give you something different.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜