Grab XPath content without surrounding markup

How do you grab the content selected by an XPath expression without copying the surrounding markup?

<div id="node-123" class="clearfix">
    <div class="content">
        <div class="body">
            <p><img src="/images/image.jpg"/></p>
            <p>Some content ....</p>
        </div>
    </div>
</div>

If I use //div[@id='node-123']/div/div, I still get the surrounding <div class="body">, which is not what I want.

What I want is the content of <div class="body">, excluding the <div class="body"> markup itself but preserving the other markup inside the content (p, img, etc.).

I tried a wildcard: //div[@id='node-123']/div/div/*, but this fetches only the first p, and there can be two or more p elements. Using node() fetches nothing.

Any hint would be very much appreciated.

Thanks


Use:

//div[@id='node-123']/div/div/node()

This selects all nodes (elements, text-nodes, processing-instructions and comment-nodes) that are children of any div element that is a child of any div element that is a child of any div element in the document such that the value of its id attribute is 'node-123'.
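As a rough illustration of what "the children without the wrapper" means, here is a minimal sketch in Python. It rests on some assumptions: it uses the standard library's xml.etree.ElementTree, whose limited XPath support does not include node(), so the child nodes are collected by hand from the element's text and child elements (comments and processing instructions are not covered). The <root> wrapper and the inner_markup helper are hypothetical names added only for this example.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root> element
# so that the outer div can be located as a descendant.
DOC = """<root>
<div id="node-123" class="clearfix">
  <div class="content">
    <div class="body">
      <p><img src="/images/image.jpg"/></p>
      <p>Some content ....</p>
    </div>
  </div>
</div>
</root>"""


def inner_markup(elem):
    """Serialize the children of elem (text and elements) without elem's own tags."""
    parts = [elem.text or ""]
    for child in elem:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


root = ET.fromstring(DOC)
# ElementTree supports this subset of XPath, but not the node() test.
body = root.find(".//div[@id='node-123']/div/div")
content = inner_markup(body).strip()
print(content)
```

The printed result should contain both p elements and the img, but not the enclosing <div class="body">. A library with full XPath 1.0 support would let you evaluate the node() expression directly instead of walking the children by hand.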

Warning: It is good practice not to use the // pseudo-operator when the structure of the XML document is statically known. Using the // pseudo-operator most often results in very slow performance, because it can cause a traversal of the complete tree.


The problem turned out to be an unterminated img tag in the actual original article: <img src="/images/image.jpg"> rather than <img src="/images/image.jpg"/>.
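One plausible way to see why the unterminated tag matters, assuming the markup was being handled by a strict XML parser (the original tool is not named here), is to check both variants for well-formedness. This sketch uses Python's standard xml.etree.ElementTree; parses_as_xml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment):
    """Return True if fragment is well-formed XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Self-closed img, as in the simplified sample from the question.
closed = '<div class="body"><p><img src="/images/image.jpg"/></p></div>'
# Unterminated img, as in the actual article.
unclosed = '<div class="body"><p><img src="/images/image.jpg"></p></div>'

print(parses_as_xml(closed))
print(parses_as_xml(unclosed))
```

The unterminated variant is valid HTML but not well-formed XML, so an XML-only toolchain may reject the document or produce an unexpected tree before any XPath expression is evaluated.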
