How to reformat XML programmatically?

I have an input XML document that is awfully formatted (it's a Delphi project file, if anyone cares): inconsistent indentation, empty lines, and strings of nodes lumped together:

<BorlandProject><Delphi.Personality><Parameters><Parameters Name="HostApplication">C:\Some\Path\Filename.exe</Parameters> <!--etc--> <Excluded_Packages>


</Excluded_Packages>

I want to reformat it into something nice. What's the easiest way to do that programmatically, with Win32/COM? If MSXML, how do I go about it?

I'd like to be able to specify the indentation unit too (a tab or several spaces).

I tried using Delphi's MSXML wrapper TXmlDocument and it does indeed delete the empty lines and indent nodes with tabs, but it does not split lines like this one:

<BorlandProject><Delphi.Personality><Parameters><Parameters Name="HostApplication">C:\Some\Path\Filename.exe</Parameters> <!--etc--> <Excluded_Packages>


I tested the FormatXMLData function on a Delphi project file and it works fine: all the lines are indented correctly.

Check this code:

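uses
  XMLIntf,
  XMLDoc;

// Reformats the given XML file in place using XMLDoc.FormatXMLData.
procedure FormatXMLFile(const XmlFile: string);
var
  oXml: IXMLDocument;
begin
  oXml := TXMLDocument.Create(nil);
  oXml.LoadFromFile(XmlFile);
  // Replace the document text with its re-indented form.
  oXml.XML.Text := XMLDoc.FormatXMLData(oXml.XML.Text);
  // Reactivate the document so the new text is parsed before saving.
  oXml.Active := True;
  oXml.SaveToFile(XmlFile);
end;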


I used Tidy to format XML. RRUZ's method using xmlDoc.FormatXMLData works very well, and it makes sense to use it, but it may not cope with big XML files. When I tried to format a 100 MB, single-line XML file, the application crashed with an out-of-memory error on a 4 GB machine, and it was very slow as well.

I used the command-line version of Tidy. There is also a DLL version, with a Delphi header file that you can hunt down, but I found it more convenient to run the exe via CreateProcess than to learn the DLL API.

This is the command line I used:

tidy.exe -xml -wrap 0 -indent -quiet -o outFile.xml inFile.xml

tidy.exe is stand-alone; you don't need the DLL or anything else.
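For anyone who wants to try the CreateProcess route, here is a minimal sketch of how that command line could be launched from Delphi. It is not the exact code I used: the function name RunTidy and its parameters are made up for illustration, and it assumes tidy.exe can be found on the PATH or next to the application.

```pascal
program RunTidyDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Launches tidy with the command line shown above and waits for it to exit.
// RunTidy, TidyExe, InFile and OutFile are illustrative names (assumptions).
// Returns the exit code reported by the tidy process.
function RunTidy(const TidyExe, InFile, OutFile: string): DWORD;
var
  CmdLine: string;
  StartInfo: TStartupInfo;
  ProcInfo: TProcessInformation;
begin
  CmdLine := Format('"%s" -xml -wrap 0 -indent -quiet -o "%s" "%s"',
    [TidyExe, OutFile, InFile]);
  // CreateProcess may write to the command-line buffer, so make it unique.
  UniqueString(CmdLine);

  FillChar(StartInfo, SizeOf(StartInfo), 0);
  StartInfo.cb := SizeOf(StartInfo);

  if not CreateProcess(nil, PChar(CmdLine), nil, nil, False,
    CREATE_NO_WINDOW, nil, nil, StartInfo, ProcInfo) then
    RaiseLastOSError;
  try
    // Block until tidy has finished writing the output file.
    WaitForSingleObject(ProcInfo.hProcess, INFINITE);
    if not GetExitCodeProcess(ProcInfo.hProcess, Result) then
      RaiseLastOSError;
  finally
    CloseHandle(ProcInfo.hThread);
    CloseHandle(ProcInfo.hProcess);
  end;
end;

begin
  try
    Writeln('tidy exit code: ',
      RunTidy('tidy.exe', 'inFile.xml', 'outFile.xml'));
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

As far as I know, Tidy reports 0 for success, 1 when there were warnings and 2 when there were errors, so a caller would probably want to treat anything below 2 as a usable result; check the Tidy documentation for your version before relying on that.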

Other possibilities for formatting XML are xmllint and XMLStarlet.

I couldn't get xmllint to run at all, but I'm sure I could have if I had persisted.

XMLStarlet seemed to work well, but it didn't have any option to write to a file, only to stdout, so I didn't use it because I would have had to work out how to capture the output.
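Capturing stdout turns out to be less work than it sounds if the output only needs to land in a file. The sketch below (which I have not run against XMLStarlet itself) opens the output file with an inheritable handle and passes it to the child process as its standard output. RunToFile is a hypothetical helper name, and the XMLStarlet command line in the usage part is an assumption: it supposes the Windows executable is called xml.exe and that its fo subcommand accepts -s for the number of indent spaces.

```pascal
program RunToFileDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Runs CmdLine and redirects the child's stdout into OutFile.
// RunToFile is an illustrative helper, not part of any library.
function RunToFile(const CmdLine, OutFile: string): DWORD;
var
  Cmd: string;
  SecAttr: TSecurityAttributes;
  OutHandle: THandle;
  StartInfo: TStartupInfo;
  ProcInfo: TProcessInformation;
begin
  // The file handle must be inheritable so the child process can write to it.
  SecAttr.nLength := SizeOf(SecAttr);
  SecAttr.lpSecurityDescriptor := nil;
  SecAttr.bInheritHandle := True;

  OutHandle := CreateFile(PChar(OutFile), GENERIC_WRITE, FILE_SHARE_READ,
    @SecAttr, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
  if OutHandle = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    FillChar(StartInfo, SizeOf(StartInfo), 0);
    StartInfo.cb := SizeOf(StartInfo);
    StartInfo.dwFlags := STARTF_USESTDHANDLES;
    StartInfo.hStdInput := GetStdHandle(STD_INPUT_HANDLE);
    StartInfo.hStdOutput := OutHandle;
    StartInfo.hStdError := GetStdHandle(STD_ERROR_HANDLE);

    Cmd := CmdLine;
    // CreateProcess may write to the command-line buffer, so make it unique.
    UniqueString(Cmd);

    // bInheritHandles must be True for the redirected handle to be passed on.
    if not CreateProcess(nil, PChar(Cmd), nil, nil, True,
      CREATE_NO_WINDOW, nil, nil, StartInfo, ProcInfo) then
      RaiseLastOSError;
    try
      WaitForSingleObject(ProcInfo.hProcess, INFINITE);
      if not GetExitCodeProcess(ProcInfo.hProcess, Result) then
        RaiseLastOSError;
    finally
      CloseHandle(ProcInfo.hThread);
      CloseHandle(ProcInfo.hProcess);
    end;
  finally
    CloseHandle(OutHandle);
  end;
end;

begin
  try
    // Assumed XMLStarlet invocation: format inFile.xml with 4-space indents.
    Writeln('exit code: ',
      RunToFile('xml fo -s 4 "inFile.xml"', 'outFile.xml'));
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
```

Because the child writes straight to the file, nothing has to be buffered in the calling application, which matters for the very large files mentioned above.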
