
Java contextual SAX / StAX parsing

I would like to extract all text elements that appear as direct children of the root node. I've had a look at the standard Java SAX facilities using DefaultHandler, but it doesn't seem to be path aware.

The difficulty is restricting the match to first-level nodes, not extracting the text nodes themselves.

Is there a non-DOM approach to this? (Note: the node names are not known in advance.)

[EDIT]

Sample input

<root>
   <a>text1</a>
   <b>text2</b>
   <c>text3</c>
   <nested>
       <d>not_text4</d>
       ...
   </nested>
   ...
</root>

Sample output

Map<String, String> map := {
    {a, text1}
    {b, text2}
    {c, text3}
}

Currently solved with a DOM-oriented workaround, although there are libraries that offer a subset of XPath expressions for SAX / StAX.


SAX and StAX indeed aren't path aware by nature, as they're event oriented. While it's certainly possible to implement a handler that tracks the parsing depth, you're probably better off with XPath.
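For reference, the depth-tracking handler mentioned above could look roughly like the sketch below. The class name FirstLevelText and its parse helper are made up for illustration, and it assumes that a first-level element only counts when it contains no child elements (so <nested> from the sample is skipped).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of elements that are direct children of the root.
class FirstLevelText extends DefaultHandler {
    private final Map<String, String> result = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private int depth = 0;            // 0 = outside root, 1 = inside root, 2 = inside a child of root
    private boolean hasChildElements; // set when the current first-level element contains elements

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        if (depth == 2) {             // a direct child of the root starts: reset the state
            text.setLength(0);
            hasChildElements = false;
        } else if (depth > 2) {       // anything deeper marks the first-level element as nested
            hasChildElements = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth == 2) {             // only text directly inside a first-level element
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 2 && !hasChildElements) {
            result.put(qName, text.toString().trim());
        }
        depth--;
    }

    static Map<String, String> parse(String xml) throws Exception {
        FirstLevelText handler = new FirstLevelText();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><a>text1</a><b>text2</b><c>text3</c>"
                + "<nested><d>not_text4</d></nested></root>";
        System.out.println(parse(xml));
    }
}
```

Note that repeated element names overwrite each other in the map, and an empty first-level element ends up with an empty string as its value; whether that is acceptable depends on your input.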

A somewhat more complex tactic might be to write an XSLT transform that retains only the elements you're after and then process the result using SAX or StAX.
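A rough sketch of that XSLT idea, using the JDK's built-in javax.xml.transform API. The class name FirstLevelFilter and the stylesheet are made up for illustration; the stylesheet assumes "the elements you're after" means children of the root that have no child elements of their own. Bear in mind that XSLT processors generally build their own in-memory model of the input, so this may not save memory compared to DOM.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: strips everything except leaf elements directly under the root.
class FirstLevelFilter {
    // Copies the root element and, inside it, only the children without child elements.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/*'>"
        + "<xsl:copy><xsl:copy-of select='*[not(*)]'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String filter(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><a>text1</a><b>text2</b>"
                + "<nested><d>not_text4</d></nested></root>";
        System.out.println(filter(xml));
    }
}
```

To hand the filtered document straight to a SAX handler instead of a string, the StreamResult could be swapped for a javax.xml.transform.sax.SAXResult wrapping your ContentHandler.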


Try JAXB. It adds a little overhead, but you get a powerful tool for working with XML.
