How to get only the top level node's text content with getTextContent()
I'm trying to get just the top-level text and none of the child text. So I have the following XML:
<job>
text1
<input> text2 </input>
</job>
and I would like to get only the parent's text (text1). So in this example I would do
node.getTextContent();
and get text1
, not text1text2
which getTextContent is currently giving me. Now I've read the man pages and I know they say that getTextContent returns the concatenated string of all the children with the parent. But I would just like the text from the parent. Another way I was thinking about doing it was to try to isolate the parent from the children and call getTextContent on just the parent, but I don't know how feasible that is.
Any help would be appreciated
Thanks, -Josh
Iterate through all the children of the node and concatenate those that are text nodes. Either that or XPath.
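A minimal sketch of the iteration approach, assuming the standard org.w3c.dom API. The class name OwnText and the helpers parse and ownText are made up for illustration, and the trim() at the end is my own choice to drop the newlines around text1:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OwnText {
    // Hypothetical helper: parses an XML string and returns its root element.
    public static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Concatenates only the direct text (and CDATA) children of the node,
    // skipping child elements such as <input>.
    public static String ownText(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Element job = parse("<job>\ntext1\n<input> text2 </input>\n</job>");
        System.out.println(ownText(job)); // prints "text1"
    }
}
```

If there is text both before and after a child element, this joins all of it together, so you may want a separator between the pieces.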
Does getChildNodes() work? If so, you could loop over all the child nodes, call getTextContent() on each child element, and remove that text from the parent's node.getTextContent(). What remains is the text that isn't part of a sub-node.
Best answer: don't mix text with sub-nodes. I had to double-check that the XML you provided is even legal. It is, but it scares me.
I think you could probably use an XPath of job/text(); this might be easier than navigating the DOM.
If you can, avoid mixed content; it's a bit of a pain to work with.
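Something like this should work for the XPath route, using the javax.xml.xpath package that ships with the JDK. This is only a sketch: the class name XPathOwnText and the method ownText are made up for illustration, and /job/text() assumes job is the root element as in the question. It asks for a node set and joins the matches, since text() can match more than one text node in mixed content:

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathOwnText {
    // Hypothetical helper: returns the text that sits directly under <job>,
    // ignoring text inside child elements.
    public static String ownText(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // text() selects only the text nodes that are direct children of job.
            NodeList texts = (NodeList) xpath.evaluate(
                    "/job/text()",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < texts.getLength(); i++) {
                sb.append(texts.item(i).getNodeValue());
            }
            return sb.toString().trim();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<job>\ntext1\n<input> text2 </input>\n</job>";
        System.out.println(ownText(xml)); // prints "text1"
    }
}
```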
Instead of this
node.getTextContent();
use this:
// Note: this only helps when the first child is the text node you want.
if (node.getFirstChild() != null)
{
    String text = node.getFirstChild().getTextContent();
}
node.firstChild.textContent.trim();
If anyone is having problems with this, the best way I found to do it was to get all the child nodes of the node and then check the node type of each child. If a child is a text node, call getTextContent() on that child and there you go!