xml missing element in python

I am using the DOM parser (xml.dom.minidom) in Python 2.7.2. The goal is to extract the data into a .db file and use it on SQL Server. I currently have no problem with the sqlite3 library. I have read the similar questions and answers about how to handle a missing element while parsing XML files, but I still couldn't figure out a solution. The XML has 15000+ elements. Here is a basic sample of the XML:

<topo>
   <vlancard>
      <id>4545</id>
      <nodeValue>21</nodeValue>
      <vlanName>voice</vlanName>
   </vlancard>
   <vlancard>
      <id>1234</id>
      <nodeValue>42</nodeValue>
      <vlanName>camera</vlanName>
   </vlancard>
   <vlancard>
      <id>9876</id>
      <nodeValue>84</nodeValue>
   </vlancard>
</topo>

Like the third vlancard, several elements do not have a vlanName node. That makes the element counts inconsistent, e.g.:

from xml.dom import minidom
# raw string, so that \v is not read as a vertical-tab escape
xmldoc = minidom.parse(r'c:\vlan.xml')
vlId = xmldoc.getElementsByTagName('id')
vlValue = xmldoc.getElementsByTagName('nodeValue')
vlName = xmldoc.getElementsByTagName('vlanName')

After running the module:

IndexError: list index out of range
>>> len(vlId)
16163
>>> len(vlName)
16155

Because of this, the ordering of the elements breaks. When filling the table, the parser skips the missing elements, so values end up paired with the wrong rows. I use a simple while loop to insert the values into the table:

x = 0
while x < len(vlId):
    # vlName is shorter than vlId, so vlName[x] is misaligned and eventually raises IndexError
    c.execute('insert into vlan (id, nodeValue, vlanName) values (?, ?, ?)',
              (vlId[x].firstChild.nodeValue, vlValue[x].firstChild.nodeValue, vlName[x].firstChild.nodeValue))
    x = x + 1

How else can I do this? Any help will be appreciated.

Yusuf


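Instead of parsing the entire XML into separate lists and then inserting, iterate over each vlancard, retrieve its id/nodeValue/vlanName, and insert them into the DB. A card with no vlanName then leaves only its own row without a name, instead of shifting every later name.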
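
A minimal sketch of that per-vlancard approach, assuming a table named vlan with three text columns and storing NULL when a child element is missing. The helper names child_text and load_vlans, the in-memory database, and the shortened sample document are made up for illustration and are not from the original post.

```python
import sqlite3
from xml.dom import minidom

# Shortened sample; the second vlancard has no <vlanName>.
SAMPLE = """<topo>
   <vlancard>
      <id>4545</id>
      <nodeValue>21</nodeValue>
      <vlanName>voice</vlanName>
   </vlancard>
   <vlancard>
      <id>9876</id>
      <nodeValue>84</nodeValue>
   </vlancard>
</topo>"""


def child_text(parent, tag):
    """Return the text of the first <tag> under parent, or None if it is absent or empty."""
    nodes = parent.getElementsByTagName(tag)
    if nodes and nodes[0].firstChild is not None:
        return nodes[0].firstChild.nodeValue
    return None


def load_vlans(xmldoc, conn):
    """Build one row per <vlancard> and insert them all (hypothetical helper)."""
    rows = [
        (child_text(card, 'id'),
         child_text(card, 'nodeValue'),
         child_text(card, 'vlanName'))
        for card in xmldoc.getElementsByTagName('vlancard')
    ]
    # Placeholders let sqlite3 handle quoting; None is stored as NULL.
    conn.executemany(
        'insert into vlan (id, nodeValue, vlanName) values (?, ?, ?)', rows)
    conn.commit()
    return rows


# Assumed schema: three text columns in an in-memory database.
conn = sqlite3.connect(':memory:')
conn.execute('create table vlan (id text, nodeValue text, vlanName text)')
rows = load_vlans(minidom.parseString(SAMPLE), conn)
```

For the real file, minidom.parse(r'c:\vlan.xml') would replace parseString(SAMPLE), and a file path would replace ':memory:'. Because the lookup is scoped to one vlancard at a time, the three values in a row always come from the same card.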
