Java: problems while parsing an XML file
I have the following XML and I'm trying to print the value of some nodes. For example, with the following code I want to print the title:
NodeList list = doc.getElementsByTagName("photo");
element = (Element)list.item(0);
list = element.getChildNodes();
System.out.println(list.item(0).getNodeName());
System.out.println(list.item(0).getNodeValue());
and I get
null
#text
instead of "title" and "bigfish live 200812"
What am I doing wrong? Thanks.
<?xml version="1.0" encoding="utf-8" ?>
<rsp stat="ok">
<photo id="2882550369" secret="21054282c8" server="3106" farm="4" dateuploaded="1222202793" isfavorite="0" license="0" safety_level="0" rotation="0" views="5" media="photo">
<owner nsid="64878451@N00" username="fishthemusic" realname="masayoshi yamamiya" location="kawasaki, japan" iconserver="4" iconfarm="1" />
<title>bigfish live 200812</title>
<description>photo by Kazuhiro Nakamura</description>
<visibility ispublic="1" isfriend="0" isfamily="0" />
<dates posted="1222202793" taken="2008-09-24 05:46:33" takengranularity="0" lastupdate="1222998937" />
<editability cancomment="1" canaddmeta="0" />
<publiceditability cancomment="1" canaddmeta="0" />
<usage candownload="1" canblog="1" canprint="0" canshare="1" />
<comments>0</comments>
<notes />
<tags>
<tag id="314160-2882550369-80673" author="64878451@N00" raw="bigfish" machine_tag="0">bigfish</tag>
<tag id="314160-2882550369-5558" author="64878451@N00" raw="live" machine_tag="0">live</tag>
<tag id="314160-2882550369-29726586" author="64878451@N00" raw="upcoming:event=1167424" machine_tag="1">upcoming:event=1167424</tag>
</tags>
<urls>
<url type="photopage">http://www.flickr.com/photos/fishthemusic/2882550369/</url>
</urls>
</photo>
</rsp>
There is a text node in your photo element, marked as XXXX in the example below. You're getting this text node. Note that there may be multiple adjacent text nodes. You need to find the first node of Element type to get your owner element.
<photo ...>XXXX
XXXX<owner nsid="64878451@N00" ... />
Try this instead:
NodeList list = doc.getElementsByTagName("photo");
element = (Element)list.item(0);
list = element.getChildNodes();
int ix = 0;
while (ix < list.getLength() && list.item(ix).getNodeType() != Node.ELEMENT_NODE) {
ix++;
}
// now ix points to your first element node (if there was one)
if (ix < list.getLength()) {
System.out.println(list.item(ix).getNodeName());
System.out.println(list.item(ix).getNodeValue());
}
Btw, the nodeValue of an element is null, so you should see
owner
null
as output. See also http://download.oracle.com/javase/6/docs/api/org/w3c/dom/Node.html for details. (It also shows that #text is the nodeName of text nodes, exactly what you are getting.)
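If the goal is the title text itself, a minimal sketch along the same lines follows. It assumes a trimmed-down copy of the XML from the question held in a string; the class name PhotoTitleDemo and its helper methods are made up for illustration. It skips the whitespace text nodes to reach the first element child, and reads the title with getTextContent(), since getNodeValue() on an element is null.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PhotoTitleDemo {
    // Trimmed-down copy of the XML from the question (an assumption for this sketch).
    static final String SAMPLE =
        "<rsp stat=\"ok\">\n"
        + "<photo id=\"2882550369\" secret=\"21054282c8\">\n"
        + "<owner nsid=\"64878451@N00\" username=\"fishthemusic\" />\n"
        + "<title>bigfish live 200812</title>\n"
        + "</photo>\n"
        + "</rsp>";

    // Parses the XML and returns the first <photo> element.
    static Element firstPhoto(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("photo").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Name of the first child that is an element, skipping text nodes; null if there is none.
    static String firstElementChildName(String xml) {
        for (Node n = firstPhoto(xml).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return n.getNodeName();
            }
        }
        return null;
    }

    // Text of the <title> element; getTextContent() returns the text of its child text node.
    static String titleOf(String xml) {
        Node title = firstPhoto(xml).getElementsByTagName("title").item(0);
        return title == null ? null : title.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(firstElementChildName(SAMPLE)); // owner
        System.out.println(titleOf(SAMPLE));               // bigfish live 200812
    }
}
```

Looking the element up by tag name avoids depending on where title sits among the children of photo.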
Because when you call element.getChildNodes() you are getting ALL child nodes of that Element, not just child elements. Attributes like id and secret are not part of that list (they are reached through getAttributes() or getAttribute()), but the whitespace between the tags is, as Text nodes. So list.item(0) is a Text node, which is why you aren't getting the results you expect.
getNodeName() returns #text because that is the node name of every Text node.
getNodeValue() returns the text itself, which here is only the whitespace after the opening photo tag. For an Element node, getNodeValue() returns null.
Also, please don't redefine the same variable (e.g. list) to re-use for something completely different. It's really bad practice.
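To see what getChildNodes() actually hands back, here is a small sketch. It assumes a shortened copy of the photo element from the question; the class name ChildNodesDemo and its helpers are hypothetical. It lists the names of the child nodes and reads an attribute separately through getAttribute(), using one variable per purpose.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodesDemo {
    // Shortened copy of the <photo> element from the question (an assumption for this sketch).
    static final String SAMPLE =
        "<photo id=\"2882550369\" secret=\"21054282c8\">\n"
        + "<owner nsid=\"64878451@N00\" />\n"
        + "<title>bigfish live 200812</title>\n"
        + "</photo>";

    // Parses the XML and returns its root element.
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Node names of all children of the root, in document order (text nodes included).
    static List<String> childNodeNames(String xml) {
        NodeList children = root(xml).getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    // Attributes are not child nodes; they are read from the element itself.
    static String attribute(String xml, String name) {
        return root(xml).getAttribute(name);
    }

    public static void main(String[] args) {
        System.out.println(childNodeNames(SAMPLE));  // [#text, owner, #text, title, #text]
        System.out.println(attribute(SAMPLE, "id")); // 2882550369
    }
}
```

The #text entries are the line breaks between the tags; id and secret never show up in the child list.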
Using JAXB instead of SAX/DOM will make this rocket science much easier. See this article for explanations.
First, write the equivalent XSD schema (you can omit unwanted nodes); here is a good start:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="photo">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="owner" type="owner" />
<xsd:element name="title" type="xsd:string" />
</xsd:sequence>
<xsd:attribute name="id" type="xsd:long" />
<xsd:attribute name="secret" type="xsd:string" />
</xsd:complexType>
</xsd:element>
<xsd:complexType name="owner">
<xsd:attribute name="nsid" type="xsd:string" />
</xsd:complexType>
</xsd:schema>
Secondly, generate classes from this schema using the jaxb2 Maven plugin: http://mojo.codehaus.org/jaxb2-maven-plugin/usage.html.
Then, write some code (and add JAXB maven dependency to your project):
public class JaxbTest {
@Test
public void should_parse_photo() throws JAXBException {
// file.xml is assumed to have <photo> as its root element, matching the schema above
URL xmlUrl = Resources.getResource("file.xml");
Photo photo = parse(xmlUrl, Photo.class);
assertEquals("bigfish live 200812", photo.getTitle());
}
private <T> T parse(URL url, Class<T> clazz) throws JAXBException {
Unmarshaller unmarsh = JAXBContext.newInstance(clazz).createUnmarshaller();
return clazz.cast(unmarsh.unmarshal(url));
}
}
ps. Resources.getResource is from guava; using Thread.currentThread().getContextClassLoader().getResource("file.xml") instead works