How to validate very large XML files?
How can I validate a large XML file (>100 MB)? I tried to open it with IE, Firefox and Chrome, and it either crashes or doesn't do anything.
xmllint --stream
Worked on a 1.2 GB file with memory limited to 500 MB:
ulimit -Sv 500000
xmllint --stream a.xml
Without --stream, Linux kills the process, and without ulimit, my computer jams.
I was not able, however, to get output from --xpath when using --stream; see: How to do command line XPath queries in huge XML files?
Tested in Ubuntu 14.04, xmllint version 20901.
You can try using a command-line validator, for example xmlstarlet:
$ xmlstarlet validate bigfile.xml
The only tool I know of that combines a large-file viewer and an XML validator for huge files is XML ValidatorBuddy. The file viewer doesn't load the complete file at once, but you can still scroll through it, and XML syntax coloring is applied. The validation uses the SAX parser from Xerces, so a document of more than 100 MB shouldn't be a problem.
Oxygen XML has huge-file support that includes validation:
http://www.oxygenxml.com/#14.1Huge_XML_Files_Support
You can also use the XML Tools plugin in Notepad++; it has a function "Check XML Syntax now". It's simple to download, and if you don't use Notepad++ already, it's a good reason to start!
The following command worked for me: xmllint --huge
In Java, and I'm sure in other languages, there are two kinds of solutions: those that read in an entire XML file and process it as a complete DOM, and those that process the XML as a stream in an event-driven way. You want the second kind, which never loads the entire file into memory. See SAX for a Java solution to the problem.
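The same event-driven idea can be sketched in Python with the standard library's xml.sax module. This is only an illustration of the streaming approach, not the Java SAX code the answer refers to: the names CountingHandler and check_well_formed are made up for this example, and the parser behind xml.sax (expat) checks well-formedness only, so it does not validate against a DTD or XSD.

```python
import xml.sax


class CountingHandler(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements and keeps no document tree."""

    def __init__(self):
        super().__init__()
        self.elements = 0

    def startElement(self, name, attrs):
        # Called once per opening tag as the parser streams through the input.
        self.elements += 1


def check_well_formed(source):
    """Stream-parse `source` (a filename or a file-like object).

    Returns (True, summary) if the document is well-formed,
    or (False, message) describing the first fatal error.
    """
    handler = CountingHandler()
    try:
        # The default error handler raises SAXParseException on fatal errors.
        xml.sax.parse(source, handler)
    except xml.sax.SAXParseException as exc:
        return False, f"line {exc.getLineNumber()}: {exc.getMessage()}"
    return True, f"{handler.elements} elements"
```

Calling check_well_formed("bigfile.xml") reads the file in chunks, so memory use should stay roughly constant regardless of file size.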
You could try the EditiX XML editor.
If you load your document into EditiX and there are problems with the XML, e.g. mismatched opening and closing tags, the editor will still load the file, and in the bottom right corner of the screen you'll see a number displayed in red; e.g. a red "5" means there are five errors in the document.
I've not tried a 100 MB document, but I've done over 15 MB and it seemed quite happy.
There's a free version.
In addition to dj_segfault's comment on phihag's answer: xmlstarlet is fortunately NOT dead. They've just released version 1.3.
If you want a decent command-line tool that can manipulate XML, xmlstarlet is perfect (and pretty fast).
Windows version of xmlstarlet:
> xml val <xmlfile.xml>
Liquid Studio Community Edition contains a Large File Editor which can also be used to validate XML files. It doesn't really have an upper limit on the size of the files you can open: terabyte files open instantly on low-spec machines, and it's free.
On Windows you can write a simple application based on the .NET platform. The System.Xml.XmlReader class is capable of validating huge files. An example is in this answer: Validating an XML against referenced XSD in C#