Xalan XPathAPI traverse question

I'm trying to parse an XHTML file using the Xalan XPathAPI, and I'm stuck on the following requirement. Here's a snippet from the XHTML:

<table border="0" cellspacing="0" cellpadding="0" class="cmnt_message">
            <tr>
              <td width="33" align="right">
                <span class="cmnt_baloon"><!-- Image --></span>
              </td>
              <td width="767" class="red pad_l_10">
                Posted by Macha on Mar 06, 2011 at 01:02 PM
              </td>

            </tr>
            <tr>
              <td colspan="2" class="cmnt_text">
                @rmaytee<br />
                <br />
                #2<br />
                <br />
                In 2011 it is possible to switch to old mat/map browser<br />

                <br />
                Just look around<br />
                <br />
                <a target="_blank" href=
                "http://area.autodesk.com/forum/autodesk-3ds-max/autodesk-3ds-max--3ds-max-design-2011/material-editor/">area.autodesk.com/forum/autodesk-3ds-max/autodesk-3ds-max--3ds-max-design-2011/material-editor/</a><br />

                <br />
                <br />
                <br />

              </td>
            </tr>
          </table>
          <table border="0" cellspacing="0" cellpadding="0" class="cmnt_message">
            <tr>
              <td width="33" align="right">
                <span class="cmnt_baloon"><!-- Image --></span>
              </td>
              <td width="767" class="red pad_l_10">

                Posted by rmaytee on Mar 02, 2011 at 06:04 PM
              </td>
            </tr>
            <tr>
              <td colspan="2" class="cmnt_text">
                2 things:<br />
                <br />
                1- Please bring back "use object center as start snap point" in the snap settings. We have voiced our opinion about this, now please show us you care. <a target="_blank" href=
                "http://www.the-area.com/forum/autodesk-3ds-max/autodesk-3ds-max--3ds-max-design-2011/use-object-center-as-start-snap-point">www.the-area.com/forum/autodesk-3ds-max/autodesk-3ds-max--3ds-max-design-2011/use-object-center-as-start-snap-point</a><br />

                <br />
                2- Make the Material/Map Browser the way it used to be. It is SO SLOW. At least make an option to switch to a "classic Material/Map Browser" or something.
              </td>
            </tr>
          </table>

I'm facing a couple of issues here.

  1. I'm trying to extract values from each table with class cmnt_message: the "Posted by..." text in the first row, and the text content of the cmnt_text cell. Here's the XPath for the "Posted by" part:

/html:html/html:body//html:div[@class='content_d']/html:table[@class='cmnt_message']/html:tr[1]/html:td[2]/text()

This returns "Posted by Macha on Mar 06, 2011 at 01:02 PM", which is what I want. But when I try to get the cmnt_text content with the following XPath expression

/html:html/html:body//html:div[@class='content_d']/html:table[@class='cmnt_message']/html:tr[2]/html:td/text()

I get "@rmaytee", i.e. only the text up to the first <br />. I want the entire text inside cmnt_text.

  2. The other problem is that I need to iterate over the cmnt_message tables and build a collection of Message objects, each consisting of the "posted by" line and the comment. I'm not sure how to iterate using XPath.

    SAX2DOM sax2dom = new SAX2DOM();
    p.setContentHandler(sax2dom);
    p.parse(new InputSource(urlXML.openStream()));
    Node doc = sax2dom.getDOM();
    XObject comment = XPathAPI.eval(doc, commentPath);

But this returns only the first occurrence of the cmnt_message class.

Any pointers would be highly appreciated.

Thanks


What you want is called the string value of the node.

If your XPath engine supports returning a string as the result type, you can use:

string(
    /html:html
       /html:body
          //html:div[@class='content_d']
             /html:table[@class='cmnt_message']
                /html:tr[1]
                   /html:td[2]
)

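Alternatively, select the td element itself and read its text through the DOM (for example, Node.getTextContent() in DOM Level 3).

Selecting individual text nodes is a poor fit for a mixed content model like XHTML, where text is interleaved with child elements such as <br />.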
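To make the difference concrete, here is a minimal sketch. It makes two assumptions that are not in the question: it uses the JDK's built-in javax.xml.xpath instead of Xalan's XPathAPI, and it runs against a small, namespace-free fragment (the XML constant and the StringValueDemo class are made up for illustration). When a node-set is converted to a string, XPath 1.0 takes only the first node in document order, which is why text() stops at the first <br />.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StringValueDemo {
    // Hypothetical, namespace-free fragment modelled on the question's markup.
    static final String XML =
        "<table class='cmnt_message'>"
      + "<tr><td>Posted by Macha</td></tr>"
      + "<tr><td class='cmnt_text'>@rmaytee<br/>#2<br/>Just look around</td></tr>"
      + "</table>";

    // Evaluates an XPath expression against XML and returns the result as a string.
    static String evaluate(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, doc);
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Node-set of three text nodes; only the first one is converted.
        System.out.println(evaluate("/table/tr[2]/td/text()"));   // prints "@rmaytee"
        // String value of the td: all descendant text, concatenated.
        System.out.println(evaluate("string(/table/tr[2]/td)"));  // prints "@rmaytee#2Just look around"
    }
}
```

The <br /> elements contribute no characters to the string value, so the lines run together. If the line breaks matter, you need to walk the child nodes instead.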


You have to either use XSL or iterate over the child nodes of /html:html/html:body//html:div[@class='content_d']/html:table[@class='cmnt_message']/html:tr[2]/html:td yourself.
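For the iteration part of the question, one possible approach is to select every cmnt_message table as a node list and then evaluate short relative paths against each table. The sketch below rests on assumptions: it uses the JDK's javax.xml.xpath rather than Xalan (with Xalan, XPathAPI.selectNodeList should play the role of the NODESET evaluation), the sample markup is namespace-free, and the Message class and SAMPLE constant are hypothetical. In the real XHTML document the elements are in a namespace, so the html: prefix would have to be bound as in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MessageExtractor {
    // Hypothetical value class; the question only says it holds "posted by" and the comment.
    static final class Message {
        final String postedBy;
        final String comment;

        Message(String postedBy, String comment) {
            this.postedBy = postedBy;
            this.comment = comment;
        }
    }

    // Hypothetical, namespace-free sample shaped like the question's markup.
    static final String SAMPLE =
        "<div class='content_d'>"
      + "<table class='cmnt_message'>"
      + "<tr><td/><td>Posted by Macha on Mar 06, 2011 at 01:02 PM</td></tr>"
      + "<tr><td class='cmnt_text'>@rmaytee<br/> #2<br/> Just look around</td></tr>"
      + "</table>"
      + "<table class='cmnt_message'>"
      + "<tr><td/><td>Posted by rmaytee on Mar 02, 2011 at 06:04 PM</td></tr>"
      + "<tr><td class='cmnt_text'>2 things:<br/> 1- snap point<br/> 2- browser</td></tr>"
      + "</table>"
      + "</div>";

    static List<Message> extract(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Select all message tables at once instead of a single result.
            NodeList tables = (NodeList) xpath.evaluate(
                "/div[@class='content_d']/table[@class='cmnt_message']",
                doc, XPathConstants.NODESET);

            List<Message> messages = new ArrayList<>();
            for (int i = 0; i < tables.getLength(); i++) {
                Node table = tables.item(i);
                // Relative paths are evaluated with the current table as context node.
                String postedBy = xpath.evaluate("normalize-space(tr[1]/td[2])", table);
                String comment = xpath.evaluate("normalize-space(tr[2]/td)", table);
                messages.add(new Message(postedBy, comment));
            }
            return messages;
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Message m : extract(SAMPLE)) {
            System.out.println(m.postedBy + " -> " + m.comment);
        }
    }
}
```

normalize-space() is used here only to trim the indentation and collapse the whitespace left between the <br /> elements. Drop it if the original spacing should be preserved.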
