Handling carriage return in canonicalization with Java
I am trying to canonicalize an HTML text node with the com/sun/org/apache/xml/internal/security/c14n/Canonicalizer.java
class. My input file has a carriage return and a line feed at the end of each line. After canonicalization I expect to see the carriage return transformed into &#xD;. However, the output I get does not contain the carriage return; it only contains the line feed. How should I modify my code to include the carriage return?
Example: my input, with CR and LF at the end of each line:
<MyNode xmlns="http://www.artsince.com/test#">Lqc3EeJlyY45bBm1lha869dkHWw1w+U8A6aKM2Xuwk3yWTjt0A2Wq/25rAncSBQlBGOCyTmhfic9(crlf)
9mWf4mC2Ui6ccLqCMjFR4mDQApkfoTy+Cu2eHul9CRjKa0TqckFv7ryda9V5MHruueXII/V+gPLT(crlf)
c76LsetK8C1434K66+Q=</MyNode>
This is the sample code I use:
// Parse the input file with a namespace-aware parser
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new FileInputStream(new File("C:\\text.xml")));
// The security library must be initialized before a Canonicalizer is requested
if (!Init.isInitialized()) {
    Init.init();
}
// Select the text nodes of the root element
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "child::*/child::text()";
NodeList textNodeList = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
// Canonicalize the selected nodes and print the result
Canonicalizer cn = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
byte[] canonn = cn.canonicalizeXPathNodeSet(textNodeList);
System.out.println(new String(canonn).toCharArray());
The output I get has only LF at the end of each line:
Lqc3EeJlyY45bBm1lha869dkHWw1w+U8A6aKM2Xuwk3yWTjt0A2Wq/25rAncSBQlBGOCyTmhfic9(lf)
9mWf4mC2Ui6ccLqCMjFR4mDQApkfoTy+Cu2eHul9CRjKa0TqckFv7ryda9V5MHruueXII/V+gPLT(lf)
c76LsetK8C1434K66+Q=
However, I expect to see &#xD; and LF at the end of each line:
Lqc3EeJlyY45bBm1lha869dkHWw1w+U8A6aKM2Xuwk3yWTjt0A2Wq/25rAncSBQlBGOCyTmhfic9&#xD;(lf)
9mWf4mC2Ui6ccLqCMjFR4mDQApkfoTy+Cu2eHul9CRjKa0TqckFv7ryda9V5MHruueXII/V+gPLT&#xD;(lf)
c76LsetK8C1434K66+Q=
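The XML specification allows the input to use any end-of-line style, but it requires the parser to normalize every line break (CR LF, or a CR on its own) to a single line feed (\n, ASCII 10) character. The carriage return is therefore already gone by the time the canonicalizer sees the text node.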
If you want to protect the character, you must replace ASCII 13 with &#xD; yourself before the XML parser sees the input. If you use Java, I suggest using a FilterInputStream.
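A minimal sketch of such a filter follows. The class names CrEscapingInputStream and CrEscapingDemo are made up for illustration and are not part of any library. The sketch assumes an ASCII-compatible encoding such as UTF-8 (it would corrupt UTF-16 input) and assumes carriage returns occur only in character data; a CR inside a tag, comment or CDATA section would also be rewritten, which may change the document or make it malformed.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not a library class): rewrites every CR byte (13)
// as the five ASCII bytes "&#xD;" before the XML parser sees the input.
class CrEscapingInputStream extends FilterInputStream {
    private static final byte[] ESCAPED_CR = "&#xD;".getBytes(StandardCharsets.US_ASCII);
    // Index of the next escape byte still to be returned; length means "none pending".
    private int pending = ESCAPED_CR.length;

    CrEscapingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pending < ESCAPED_CR.length) {
            return ESCAPED_CR[pending++];
        }
        int b = in.read();
        if (b == '\r') {
            pending = 1;            // the remaining bytes "#xD;" follow on later reads
            return ESCAPED_CR[0];   // '&'
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // Byte-by-byte loop so the escaping logic lives in one place.
        int n = 0;
        while (n < len) {
            int b = read();
            if (b < 0) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return (n == 0 && len > 0) ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would not restore pending escape bytes
    }
}

class CrEscapingDemo {
    // Parses the XML through the filter and returns the root element's text content.
    static String rootText(String xml) {
        try (InputStream in = new CrEscapingInputStream(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))) {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(in);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the code from the question, the same idea would be to wrap the file stream, for example db.parse(new CrEscapingInputStream(new FileInputStream(new File("C:\\text.xml")))). The parser then keeps the carriage return in the text node, so the canonicalizer has a CR to write out as &#xD;.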
Try this:
static {
    System.setProperty("org.apache.xml.security.ignoreLineBreaks", "true");
    org.apache.xml.security.Init.init();
}
With the library org.apache.santuario:xmlsec:3.0.1 this works fine. With org.apache.xml.security:xml-security:1.4.1 it does not seem to work.