Alter XML while preserving layout
What would you use to alter an XML-file while preserving as much as possible of layout, including indentation and comments?
My problem is that I have a couple of massive hand-edited XML-files describing a user interface, and now I need to translate several attributes to another language.
I've tried doing this using Python + ElementTree, but it preserved neither whitespace nor comments.
I've seen XSLT being suggested for similar questions, but I don't think that is an alternative in this case, since I need to do some logic and lookups for each attribute.
It would be preferable if attribute order in each element is preserved as well, but I can tolerate changed order.
Any DOM manipulation module should suit your needs. Layout is just text data, so it is represented as text nodes in the DOM:
>>> from xml.dom.minidom import parseString
>>> dom = parseString('''\
... <message>
... <text>
... Hello!
... </text>
... </message>''')
>>> dom.childNodes[0].childNodes
[<DOM Text node "u'\n '">, <DOM Element: text at 0xb765782c>, <DOM Text node "u'\n'">]
>>> text = dom.getElementsByTagName('text')[0].childNodes[0]
>>> text.data = text.data.replace(u'Hello', u'Hello world')
>>> print(dom.toxml())
<?xml version="1.0" ?><message>
<text>
Hello world!
</text>
</message>
If you use an XSLT processor such as xt, then you can write extension methods in Java that can perform any arbitrary transformation you need.
Having said that, I have used Python's xml.dom.minidom module successfully for this sort of transformation. It does preserve whitespace and layout.
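For the attribute-translation case in the question, a minimal sketch along these lines might work. The lookup table, the attribute names (`label`, `tooltip`), the sample XML and the function name `translate_attributes` are all made up for illustration; only the `xml.dom.minidom` calls are real.

```python
from xml.dom.minidom import parseString

# Hypothetical lookup table and attribute names, for illustration only.
TRANSLATIONS = {"Open": "Ouvrir", "Close": "Fermer"}
TRANSLATABLE = ("label", "tooltip")


def translate_attributes(xml_text, translations=TRANSLATIONS, names=TRANSLATABLE):
    """Return xml_text with the selected attribute values translated."""
    dom = parseString(xml_text)
    # getElementsByTagName("*") matches every element in the document.
    for element in dom.getElementsByTagName("*"):
        for name in names:
            if element.hasAttribute(name):
                old = element.getAttribute(name)
                # Per-attribute logic and lookups go here;
                # values without a translation are left unchanged.
                element.setAttribute(name, translations.get(old, old))
    return dom.toxml()


# Made-up sample input with indentation and a comment.
SOURCE = """<ui>
  <!-- main toolbar -->
  <button id="b1" label="Open"/>
  <button id="b2" label="Close" tooltip="Close"/>
</ui>"""

result = translate_attributes(SOURCE)
print(result)
```

Comments and the whitespace between elements survive because they are ordinary nodes in the tree. What is not kept is formatting inside a tag: toxml() prepends an XML declaration, separates attributes with single spaces, and always writes attribute values in double quotes, so a diff against a hand-edited file may still show some noise.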