Large Xml files are being truncated by MSXML4 / FreeThreadedDOMDocument40 (COM string Interop issue)
I'm using the following code to load a large Xml document (~5 MB):
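    #include <atlbase.h>  // CComPtr, CComVariant, CComBSTR
    #include <msxml6.h>   // IXMLDOMDocument, FreeThreadedDOMDocument60
    #include <tchar.h>    // _tmain, _TCHAR

    int _tmain(int argc, _TCHAR* argv[])
    {
        ::CoInitialize(NULL);
        HRESULT hr;

        // Create the free-threaded MSXML 6 DOM document.
        CComPtr<IXMLDOMDocument> spXmlDocument;
        hr = spXmlDocument.CoCreateInstance(__uuidof(FreeThreadedDOMDocument60));
        if (FAILED(hr)) return FALSE;

        // Keep whitespace, load synchronously, and skip validation.
        spXmlDocument->put_preserveWhiteSpace(VARIANT_TRUE);
        spXmlDocument->put_async(VARIANT_FALSE);
        spXmlDocument->put_validateOnParse(VARIANT_FALSE);

        // Load the file; both the HRESULT and the success flag must be checked.
        VARIANT_BOOL bLoadSucceeded = VARIANT_FALSE;
        hr = spXmlDocument->load(CComVariant(L"C:\\XMLFile1.xml"), &bLoadSucceeded);
        if (FAILED(hr) || bLoadSucceeded == VARIANT_FALSE) return FALSE;

        // Read the document node's value into a VARIANT.
        CComVariant bstrDoc;
        hr = spXmlDocument->get_nodeValue(&bstrDoc);

        // Look up a single node anywhere in the document.
        CComPtr<IXMLDOMNode> spNode;
        hr = spXmlDocument->selectSingleNode(CComBSTR(L"//SpecialNode"), &spNode);
    }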
I'm finding that the contents of bstrDoc are truncated (there are no exceptions or failed HRESULTs).
Does anyone know why? You can try this yourself just by creating a large Xml file containing nothing but <xml></xml> elements (~5 MB should do it).
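For anyone who wants to reproduce this, a test file of that kind could be generated with something like the sketch below. The helper name WriteLargeXmlFile and the wrapping <root> element are my own assumptions (a file needs a single root element to be well-formed), not part of the original repro.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes an XML file of at least `minBytes` bytes made of
// empty <xml></xml> elements wrapped in a single <root> element (the wrapper is
// an assumption so that the file is well-formed).
// Returns the number of bytes written, or 0 on failure.
std::size_t WriteLargeXmlFile(const std::string& path, std::size_t minBytes)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return 0;

    const std::string open = "<root>";
    const std::string item = "<xml></xml>";
    const std::string close = "</root>";

    std::size_t written = 0;
    out << open;
    written += open.size();

    // Keep appending empty elements until the closing tag brings the file
    // to the requested minimum size.
    while (written + close.size() < minBytes) {
        out << item;
        written += item.size();
    }

    out << close;
    written += close.size();
    out.flush();
    return out ? written : 0;
}
```

Calling WriteLargeXmlFile("C:\\XMLFile1.xml", 5 * 1024 * 1024) should give a file of roughly the size mentioned above.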
UPDATE: Switching to MSXML 6 made no difference. Setting Async to false and using get_nodeValue / get_text made no difference either (sample updated).
I noticed that if I did selectSingleNode for a node placed at the end of the document it worked fine. It appears that the document loads successfully, and the issue is instead with getting the text for a single node. I'm perplexed, however, as I have yet to find anyone else on the internet having this issue.
UPDATE 2: The problem appears to be related to COM interop itself - I've created a simple C# class that does the same thing and exposed it as a COM object. I can see that although the Xml is fine in my C# app, by the time I look at it in my debugger in the C++ app it looks exactly as it did when using MSXML.
It appears I was a victim of my own foolishness - the Xml / strings were in fact not being truncated, the viewer in Visual Studio was simply lying to me.
Outputting the strings to a file showed that the strings were all as they should be.
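For reference, the check amounts to something like the sketch below: write the string's raw code units to a file and compare the file size with the string's length, rather than trusting what the debugger displays. The helper names DumpWideString and FileSizeInBytes are hypothetical, and std::wstring stands in for the BSTR so that the sketch stays portable.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes the raw wchar_t code units of `text` to `path`.
// Returns the number of bytes written, or 0 if the file could not be written.
std::size_t DumpWideString(const std::wstring& text, const std::string& path)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return 0;

    const std::size_t byteCount = text.size() * sizeof(wchar_t);
    out.write(reinterpret_cast<const char*>(text.data()),
              static_cast<std::streamsize>(byteCount));
    out.flush();
    return out ? byteCount : 0;
}

// Hypothetical helper: returns the size of the file in bytes,
// or 0 if the file cannot be opened.
std::size_t FileSizeInBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return 0;
    return static_cast<std::size_t>(in.tellg());
}
```

With a real BSTR, the std::wstring would be built as std::wstring(bstr, ::SysStringLen(bstr)), so that the length comes from the BSTR's own length prefix. If the file size matches that length times sizeof(wchar_t), the string was never truncated.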