MSXML's loadXML fails to load even well-formed XML
I have written a C++ wrapper on top of MSXML. The load method is shown below. The problem is that it sometimes fails to load well-formed XML.
Before passing the XML as a string, I search for xmlns and replace every occurrence with xmlns:dns. In the code below I strip the BOM, then try to load the text with MSXML's loadXML method. If the load succeeds, I set the namespace as shown in the code.
class XmlDocument {
MSXML2::IXMLDOMDocument2Ptr spXMLDOM;
....
};
// XmlDocument methods
void XmlDocument::Initialize()
{
CoInitialize(NULL);
HRESULT hr = spXMLDOM.CreateInstance(__uuidof(MSXML2::DOMDocument60));
if ( FAILED(hr) )
{
throw "Unable to create MSXML:: DOMDocument object";
}
}
bool XmlDocument::LoadXml(const char* xmltext)
{
if(spXMLDOM != NULL)
{
const char BOM[3] = {'\xEF','\xBB','\xBF'};
// detect the UTF-8 byte order mark (BOM)
if(strncmp(xmltext,BOM,sizeof(BOM)) == 0)
{
xmltext += 3;
}
VARIANT_BOOL bSuccess = spXMLDOM->loadXML(A2BSTR(xmltext));
if ( bSuccess == VARIANT_TRUE)
{
spXMLDOM->setProperty("SelectionNamespaces","xmlns:dns=\"http://www.w3.org/2005/Atom\"");
return true;
}
}
return false;
}
I tried debugging but still could not figure out why loadXML() sometimes fails to load even well-formed XML. What am I doing wrong in the code? Any help is greatly appreciated.
Thanks, JeeZ
For this specific issue, please refer to "Strings Passed to loadXML must be UTF-16 Encoded BSTRs". Note that A2BSTR converts the input using the ANSI code page, not UTF-8, so any non-ASCII characters in UTF-8 text are likely to be mangled before the parser ever sees them.
Overall, an XML parser is not designed for parsing in-memory strings: loadXML does not recognize a BOM, and it restricts the encoding. An XML parser is designed to consume a byte stream and detect its encoding, which is critical for a standards-compliant parser. To make better use of MSXML, please consider loading from an IStream or a Win32 file.
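To illustrate the UTF-16 requirement above, here is a minimal, portable sketch of the conversion step: it skips a leading UTF-8 BOM and decodes the remaining bytes into UTF-16 code units. The helper name Utf8ToUtf16 is hypothetical (it is not part of MSXML or ATL), and the sketch assumes the input really is UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not an MSXML/ATL API): decodes UTF-8 into UTF-16,
// skipping a leading UTF-8 BOM. It rejects malformed lead/continuation bytes
// and truncated sequences, but does not reject overlong encodings.
std::u16string Utf8ToUtf16(const std::string& in)
{
    std::size_t i = 0;
    // Skip the UTF-8 byte order mark (EF BB BF) if present.
    if (in.compare(0, 3, "\xEF\xBB\xBF") == 0) i = 3;

    std::u16string out;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of range");
        if (cp >= 0x10000) {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

On Windows the same conversion is normally done with MultiByteToWideChar(CP_UTF8, ...), and the resulting UTF-16 buffer can be turned into a BSTR with SysAllocStringLen before calling loadXML. The hand-written decoder here is only meant to show what the conversion involves.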
I'm not a fan of A2BSTR. At the very least you're leaking memory, as the returned BSTR is never deallocated.
You could just as easily write:
VARIANT_BOOL bSuccess = spXMLDOM->loadXML(CComBSTR(xmltext));
which will handle the memory properly.
As to why it's failing: you can ask the DOMDocument for its parseError object (IXMLDOMParseError) and then fetch the reason from it. That will probably shed more light on what the real problem is.
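As a rough sketch of the parseError suggestion above: the helper names FormatParseError and DescribeParseError are hypothetical, and the MSXML part assumes the #import-generated smart-pointer wrappers used in the question (so it only compiles with MSVC). The formatting function is kept separate so it can be used and checked on its own.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats the fields exposed by IXMLDOMParseError
// (errorCode, line, linepos, reason) into a single diagnostic line.
std::string FormatParseError(long code, long line, long column,
                             const std::string& reason)
{
    std::ostringstream os;
    os << "XML parse error 0x" << std::hex << std::uppercase
       << static_cast<std::uint32_t>(code) << std::dec
       << " at line " << line << ", column " << column << ": " << reason;
    return os.str();
}

#ifdef _MSC_VER  // #import and the MSXML2 smart pointers are MSVC-specific
#import <msxml6.dll>

// Hypothetical helper: returns an empty string if the last load reported
// no error, otherwise a description taken from the document's parseError.
std::string DescribeParseError(MSXML2::IXMLDOMDocument2Ptr doc)
{
    MSXML2::IXMLDOMParseErrorPtr err = doc->parseError;
    if (err == NULL || err->errorCode == 0)
        return std::string();
    _bstr_t reason = err->reason;
    // _bstr_t converts to char* via the ANSI code page; good enough for logging.
    return FormatParseError(err->errorCode, err->line, err->linepos,
                            reason.length() ? static_cast<const char*>(reason) : "");
}
#endif
```

In the question's LoadXml, something like DescribeParseError(spXMLDOM) could be logged whenever loadXML returns VARIANT_FALSE, which should show the line, column, and reason for the failure.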
We use
hr = m_pXMLDoc->load(_variant_t(xml_file.c_str()), &varStatus);
hr = m_pXMLDoc->loadXML(_bstr_t(xml_doc.c_str()), &varStatus);
to load files and raw XML, respectively.