Java: how to parse HTML-like XML

I have an HTML-like XML document; basically it is HTML. I need to get the elements in each <line>. Each <line> element looks like this:

<line tid="744476117">  <attr>1414</attr>  <attr>31</attr><attr class="thread_title">title1</attr><attr>author1</attr><attr>date1</attr></line>

My code is below. It does recognize that there are 50 <line> elements in the file, but it throws a NullPointerException when parsing NodeList fstNmElmntLst = fstElmnt.getElementsByTagName("attr");

Any idea why this is happening? The same code has been used for other applications without problems.

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(cleanxml));
Document doc = db.parse(is);
doc.getDocumentElement().normalize();
System.out.println("Root element " + doc.getDocumentElement().getNodeName());
NodeList nodeLst = doc.getElementsByTagName("line");
for (int s = 0; s < nodeLst.getLength(); s++) {
    System.out.println(nodeLst.getLength());
    Node fstNode = nodeLst.item(s);
    if (fstNode.getNodeType() == Node.ELEMENT_NODE) {
        Element fstElmnt = (Element) fstNode;
        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName("attr");
        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);
        NodeList fstNm = fstNmElmnt.getChildNodes();
        System.out.println("attr : " + ((Node) fstNm.item(0)).getNodeValue());
    }
}


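Problem solved! One of the <line> elements does not have any <attr> children. For that element, getElementsByTagName("attr") returns an empty NodeList, item(0) returns null, and the next call, fstNmElmnt.getChildNodes(), throws the NullPointerException.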
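
For anyone hitting the same thing, here is a minimal sketch of guarding against a <line> with no <attr>, assuming the document shape from the question. The class name LineAttrReader, the method firstAttrPerLine, the <root> wrapper and the second, empty <line> are made up for illustration, and it reads the text with getTextContent() rather than walking the child nodes as the code above does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LineAttrReader {

    // Hypothetical helper: returns the text of the first <attr> in each <line>,
    // or null for a <line> that has no <attr> at all.
    public static List<String> firstAttrPerLine(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        // getElementsByTagName only returns elements, so the cast below is safe.
        NodeList lines = doc.getElementsByTagName("line");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < lines.getLength(); i++) {
            Element line = (Element) lines.item(i);
            NodeList attrs = line.getElementsByTagName("attr");

            // Guard: item(0) on an empty NodeList returns null, so check the length first.
            if (attrs.getLength() == 0) {
                result.add(null);
                continue;
            }
            result.add(attrs.item(0).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch: one normal <line>, one empty <line>.
        String xml = "<root>"
                + "<line tid=\"744476117\"><attr>1414</attr><attr>31</attr></line>"
                + "<line tid=\"2\"></line>"
                + "</root>";
        System.out.println(firstAttrPerLine(xml)); // prints [1414, null]
    }
}
```

Whether to skip the empty <line>, record a null, or report it as bad input depends on what the data is supposed to look like; the sketch records a null so the result stays aligned with the <line> order.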
