开发者

Confused by which XML processing option to use

I'm fairly new to Python, and I've just started working with XML parsing. I am getting a bit overwhelmed by all the options for working with XML, and I'm hoping an experienced person can give me some advice (and perhaps a code sample??) for the simple problem I'm working on.

I am working on a simple Python contact management application that does not involve a database - each contact's information is stored in a separate text file using XML. For example, assume the following is the contents of the file "1234.xml"

<contact>
<id>1234</id>
<name>Johnny Appleseed</name>
<phone>8145551212</phone>
<address>
    开发者_运维技巧<street>1234 Main Street</street>
    <city>Hometown</city>
    <state>OH</state>
</address>
<address>
    <street>1313 Mockingbird Lane</street>
    <city>White Plains</city>
    <state>NY</state>
</address>
</contact>

For sake of example, let's assume there can be only one phone number, but multiple address blocks.

For what I'm doing here, I need to be able to parse the XML from the file, make changes to the data, and then update the XML and save it back to the file. Let's assume there are three types of data changes that might occur:

  1. changing the data for one or more items, such as updating a phone number

  2. adding a new address block (and the corresponding data for the street/city/state of the new address)

  3. deleting an existing address block

Given what I'm trying to do here, can you recommend a particular way of doing this? (SAX, DOM, minidom, ElementTree, something else?) Code samples for whatever you suggest would be greatly appreciated.

Thank you!

Ron


The SAX and DOM APIs are older; they were pretty much translated from the Java world into Python. The ElementTree API was designed specifically to be Pythonic, i.e. to fit with the Python way of problem solving, so prefer that.

The richest and fastest implementation of ElementTree that I know of is lxml. It's XPath functionality is very useful. Untested example:

from lxml import etree

contacts = etree.parse(open("1234.xml"))

for c in contacts.xpath('//contact'):
    if c.xpath('/name')[0].text == 'Johnny Appleseed':
        c.xpath('/phone')[0].text = NEW_PHONE_NUMBER

print >> open("1234.xml", "w"), etree.tostring(contacts)


The best solution is to use ElementTree and parse this into a set of classes and manipulate the classes and then serialize them back to XML. You can do this by hand if the XML is really as simple as your example or you can use some tool or library to generate the bindings.

Working directly with XML in most cases always ends in tears, or at least hair pulling. It also isn't very maintainable, when the XML changes it usually will break your hand coded parsing.

Using a binding solution will be more robust to changes and easier to modify when you do need to manually intervene.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜