Android: Parsing XML DOM parser. Converting childnodes to string
Again a question. This time I'm parsing XML messages I receive from a server. Someone thought to be smart and decided to place HTML pages in a XML message. Now I'm kind of facing problems because I want to extract that HTML page as a string from this XML message.
Ok this is the XML message I'm parsing:
<AmigoRequest>
<From></From>
<To></To>
<MessageType>showMessage</MessageType>
<Param0>general message</Param0>
<Param1><html><head>test</head><body>Testhtml</body></html></Param1>
</AmigoRequest>
You see that in Param1 a HTML page is specified. I've tried to extract the message the following way:
public String getParam1(Document d) { if (d.getDocumentElement().getTagName().equals("AmigoRequest")) { Nod开发者_高级运维eList results = d.getElementsByTagName("Param1"); // Messagetype depends on what message we are reading. if (results.getLength() > 0 && results != null) { return results.item(0).getFirstChild().getNodeValue(); } } return ""; }
Where d is the XML message in document form. It always returns me a null value, because getNodeValue() returns null. When i try results.item(0).getFirstChild().hasChildNodes() it will return true because he sees there is a tag in the message.
How can i extract the html message <html><head>test</head><body>Testhtml</body></html>
from Param0 in a string?
I'm using Android sdk 1.5 (well almost java) and a DOM Parser.
Thanks for your time and replies.
Antek
You could take the content of param1, like this:
public String getParam1(Document d) {
if (d.getDocumentElement().getTagName().equals("AmigoRequest")) {
NodeList results = d.getElementsByTagName("Param1");
// Messagetype depends on what message we are reading.
if (results.getLength() > 0 && results != null) {
// String extractHTMLTags(String s) is a function that you have
// to implement in a way that will extract all the HTML tags inside a string.
return extractHTMLTags(results.item(0).getTextContent());
}
}
return "";
}
All you have to do is to implement a function:
String extractHTMLTags(String s)
that will remove all HTML tag occurrences from a string. For that you can take a look at this post: Remove HTML tags from a String
after checking a lot and scratching my head thousands of times I came up with simple alteration that it needs to change your API level to 8
EDIT: I just saw your comment above about getTextContent()
not being supported on Android. I'm going to leave this answer up in case it's useful to someone who's on a different platform.
If your DOM API supports it, you can call getTextContent()
, as follows:
public String getParam1(Document d) {
if (d.getDocumentElement().getTagName().equals("AmigoRequest")) {
NodeList results = d.getElementsByTagName("Param1");
// Messagetype depends on what message we are reading.
if (results != null) {
return results.getTextContent();
}
}
return "";
}
However, getTextContent()
is a DOM Level 3 API call; not all parsers are guaranteed to support it. Xerces-J does.
By the way, in your original example, your check for null
is in the wrong place; it should be:
if (results != null && results.getLength() > 0) {
Otherwise, you'd get a NPE if results
really does come back as null
.
Since getTextContent()
isn't available to you, another option would be to write it -- it isn't hard. In fact, if you're writing this solely for your own use -- or your employer doesn't have overly strict rules about open source -- you could look at Apache's implementation as a starting point; lines 610-646 seem to contain most of what you need. (Please be respectful of Apache's copyright and license.)
Otherwise, some rough pseudocode for the method would be:
String getTextContent(Node node) {
if (node has no children)
return "";
if (node has 1 child)
return getTextContent(node.getFirstChild());
return getTextContent(new StringBuffer()).toString();
}
StringBuffer getTextContent(Node node, StringBuffer sb) {
for each child of node {
if (child is a text node) sb.append(child's text)
else getTextContent(child, sb);
}
return sb;
}
Well i was almost there with the code...
public String getParam1(Document d) {
if (d.getDocumentElement().getTagName().equals("AmigoRequest")) {
NodeList results = d.getElementsByTagName("Param1");
// Messagetype depends on what message we are reading.
if (results.getLength() > 0 && results != null) {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db;
Element node = (Element) results.item(0); // get the value of Param1
Document doc2 = null;
try {
db = dbf.newDocumentBuilder();
doc2 = db.newDocument(); //create new document
doc2.appendChild(doc2.importNode(node, true)); //import the <html>...</html> result in doc2
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
Log.d(TAG, " Exception ", e);
} catch (DOMException e) {
// TODO: handle exception
Log.d(TAG, " Exception ", e);
} catch (Exception e) {
// TODO: handle exception
e.printStackTrace(); }
return doc2. .....// All I'm missing is something to convert a Document to a string.
}
}
return "";
}
Like explained in the comment of my code. All I am missing is to make a String out of a Document. You can't use the Transform class in Android... doc2.toString() will give you a serialization of the object..
But my next step is write my own parser if this doesnt work out ;)
Not the best code but a temponary solution.
public String getParam1(String b) {
return b
.substring(b.indexOf("<Param1>") + "<Param1>".length(), b.indexOf("</Param1>"));
}
Where String b is the XML document string.
精彩评论