XML Canonicalizer Problem
I'm using the package org.apache.xml.security.c14n
to canonicalize XML documents. I use the following code:
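private String canonicalizeXml(String xml) throws InvalidCanonicalizerException, CanonicalizationException, ParserConfigurationException, IOException, SAXException {
    // Assumes org.apache.xml.security.Init.init() has already been called once.
    Canonicalizer canon = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
    // Canonical XML is always UTF-8, so encode and decode explicitly rather than with the platform default.
    byte[] canonical = canon.canonicalize(xml.getBytes(java.nio.charset.StandardCharsets.UTF_8));
    return new String(canonical, java.nio.charset.StandardCharsets.UTF_8);
}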
However, it doesn't work as I expected: it doesn't remove any of the unnecessary whitespace between elements. Am I doing something wrong?
Thanks,
Ivan
I think it may be your expectation which is incorrect:
You don't say which version of XML Canonicalization, but both 1.0 and 1.1 say:
All whitespace in character content is retained (excluding characters removed during line feed normalization)
Does your XML document reference a DTD or schema? Without one, the parser has no way to know which whitespace is significant, so it has to preserve it.
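To illustrate the DTD point, here is a minimal sketch that uses only the JDK's built-in DOM parser, not the Apache canonicalizer. The class name and the sample documents are made up for the example. It assumes a validating parser and an internal DTD that declares root as element-only content; under those assumptions setIgnoringElementContentWhitespace(true) lets the parser drop the whitespace between elements, while the same document without a DTD keeps it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DtdWhitespaceDemo {
    // Hypothetical sample: the internal DTD says <root> may contain only <item> elements.
    public static final String WITH_DTD =
        "<!DOCTYPE root [<!ELEMENT root (item*)><!ELEMENT item (#PCDATA)>]>\n"
        + "<root>\n  <item>a</item>\n  <item>b</item>\n</root>";

    // The same document with no DTD, so the parser cannot tell which whitespace is ignorable.
    public static final String WITHOUT_DTD =
        "<root>\n  <item>a</item>\n  <item>b</item>\n</root>";

    // Parses the XML and returns the number of child nodes directly under the root element.
    public static int rootChildCount(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        // Only has an effect when the parser knows the content model (for example from a DTD).
        factory.setIgnoringElementContentWhitespace(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // With the DTD: only the two <item> elements remain.
        System.out.println(rootChildCount(WITH_DTD, true));
        // Without a DTD: the whitespace text nodes around the items are kept.
        System.out.println(rootChildCount(WITHOUT_DTD, false));
    }
}
```

Whitespace removed this way is gone before the document ever reaches the canonicalizer, which is why the canonical output then has none between elements.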
The org.apache.xml.security.c14n package does not remove whitespace.
I solved it by calling setIgnoringBoundaryWhitespace(true) on my SAXBuilder:
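// JDOM2: ignore whitespace-only text at element boundaries while parsing ('is' is the input stream).
SAXBuilder builder = new SAXBuilder();
builder.setIgnoringBoundaryWhitespace(true);
org.jdom2.Document doc = builder.build(is);
// Convert to a W3C DOM document, which can then be canonicalized.
DOMOutputter out = new DOMOutputter();
Document docW3 = out.output(doc);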
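If JDOM is not an option, roughly the same effect can be sketched with the JDK's own DOM API by removing whitespace-only text nodes before canonicalizing. This is an approximation and not JDOM's exact rule: it drops every text node that consists solely of whitespace, so do not use it where such text is meaningful. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceStripper {
    // Recursively removes text nodes that contain nothing but whitespace.
    public static void strip(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getTextContent().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                strip(child);
            }
            child = next;
        }
    }

    // Parses the XML, strips whitespace-only text nodes, and serializes the result.
    public static String stripAndSerialize(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        strip(doc.getDocumentElement());

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // Whitespace between elements is removed; the text inside <b> is left untouched.
        System.out.println(stripAndSerialize("<root>\n  <a>x</a>\n  <b> y </b>\n</root>"));
    }
}
```

The stripped Document (or its serialized bytes) can then be passed to the canonicalizer, which will preserve whatever whitespace is still present.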