How to get a list of concatenated text nodes

My goal is to query an XML structure with a single XPath evaluation and get a list of strings, each containing the concatenation of text3 and text5 for one "my_class" div.

The structure example is given below:

 <div>   
     <div>
         <div class="my_class">
             <div class="my_class_1"></div>
             <div class="my_class_2">text2</div>
             <div class="my_class_3">
                 text3
                 <div class="my_class_4">text4</div>
                 <div class="my_class_5">text5</div>
             </div>
         </div>
         <div class="my_class_6"></div>   
     </div>   
     <div>
         <div class="my_class">
             <div class="my_class_1"></div>
             <div class="my_class_2">text12</div>
             <div class="my_class_3">
                 text13
                 <div class="my_class_4">text14</div>
                 <div class="my_class_5">text15</div>
             </div>
         </div>   
     </div>  
 </div>

This means I want to get this list of results:

- in index 0 => text3 text5

- in index 1 => text13 text15

Currently I can only get the my_class nodes, which include the text12 that I want to exclude, or a list of the individual strings, not concatenated.

How could I proceed?

Thanks in advance for helping.

EDIT: I removed text4 and text14 from my search so that the example is exact.


EDIT: Now the question has changed...

XPath 1.0: There is no "list of strings" data type. You can use this expression to select all the container elements of the text nodes you want:

/div/div/div[@class='my_class']/div[@class='my_class_3']

Then, for each selected element, use the proper relative XPath or DOM method of your host language to get the descendant text nodes you want and concatenate their string values:

text()[1]|div[@class='my_class_5']
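
As a rough illustration of this two-step approach, here is a minimal sketch in Python using only the standard library. Note the assumptions: `xml.etree.ElementTree` supports only a subset of XPath and has no `text()` step, so the first text node is read through the element's `.text` attribute instead of `text()[1]`. The name `concat_texts` is hypothetical, and the sample document is the one from the question.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
XML = """<div>
    <div>
        <div class="my_class">
            <div class="my_class_1"></div>
            <div class="my_class_2">text2</div>
            <div class="my_class_3">
                text3
                <div class="my_class_4">text4</div>
                <div class="my_class_5">text5</div>
            </div>
        </div>
        <div class="my_class_6"></div>
    </div>
    <div>
        <div class="my_class">
            <div class="my_class_1"></div>
            <div class="my_class_2">text12</div>
            <div class="my_class_3">
                text13
                <div class="my_class_4">text14</div>
                <div class="my_class_5">text15</div>
            </div>
        </div>
    </div>
</div>"""


def concat_texts(xml_source):
    """Return one 'first text + my_class_5 text' string per my_class_3 container."""
    root = ET.fromstring(xml_source)
    # Step 1: select the container elements (path is relative to the root <div>).
    containers = root.findall("./div/div[@class='my_class']/div[@class='my_class_3']")
    results = []
    for container in containers:
        # Step 2: .text holds the text before the first child element,
        # which plays the role of text()[1] here.
        first_text = (container.text or "").strip()
        fifth = container.find("div[@class='my_class_5']")
        fifth_text = (fifth.text or "").strip() if fifth is not None else ""
        results.append(f"{first_text} {fifth_text}".strip())
    return results


print(concat_texts(XML))
```

With a full XPath 1.0 engine the second step could use the relative expression above directly; the whitespace stripping would still be needed because the text nodes carry the indentation.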

XPath 2.0: There is a sequence data type.

/div/div/div[@class='my_class']
           /div[@class='my_class_3']
              /concat(text()[1],div[@class='my_class_5'])


Could you not just use:

//div[@class='my_class']/div[@class='my_class_3']

And then get the .innerText from that? There might be a bit of spacing cleanup to do but it should contain all the inside text (including that from the class 4 and 5) but without the tags.
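
To see what this inner-text idea returns, here is a small sketch in Python's standard library, assuming `itertext()` as a stand-in for `.innerText` (the helper name `inner_text` is hypothetical). It collects every descendant text node, so text4 comes along with text3 and text5.

```python
import xml.etree.ElementTree as ET

# One "my_class" block from the question's sample document.
SAMPLE = """<div class="my_class">
    <div class="my_class_1"></div>
    <div class="my_class_2">text2</div>
    <div class="my_class_3">
        text3
        <div class="my_class_4">text4</div>
        <div class="my_class_5">text5</div>
    </div>
</div>"""


def inner_text(element):
    """Join all descendant text and collapse runs of whitespace to single spaces."""
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(SAMPLE)
container = root.find("div[@class='my_class_3']")
print(inner_text(container))
```

Because the result includes the my_class_4 text, this only fits if that extra text is acceptable or is filtered out afterwards.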


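Edit: After clarification

concat(/div/div/div[@class='my_class']/div[@class='my_class_3']/text(), ' ', /div/div/div[@class='my_class']/div[@class='my_class_3']/div[@class='my_class_5']/text())

That might work, but in XPath 1.0 concat() converts each node-set argument to the string value of its first node, so it returns a single string for the first "my_class" div only, not a list.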
