How to turn Python list comprehensions into XML

2023-02-11 11:29

I need a little help finding a tutorial or sample that takes a list comprehension, merges it with data from a CSV file, and turns the result into an XML file. From reading various Python books and PDFs (ditp, IYOCGwP, Learn Python the Hard Way, the lxml tutorial, Think Python) and online searches, I think I am most of the way there. I just need a push on tying everything together.

I am exporting an Excel spreadsheet as a CSV file. The CSV contains rows of records that I need to map into an XML file. I am new to Python and thought I would use this little project to learn the language.

The code listed is not pretty, but it works. I can read in a CSV file and dump it into a list, I can combine three lists and output the resulting list, and I can get my program to spit out a skeleton XML that is almost laid out in the format I need. Below the code I list the actual output for a small sample and the XML I am trying to produce. Sorry if this is too lengthy; this is my first post.

import csv, datetime, os, sys
from lxml import etree  
from ElementTree_pretty import prettify

f = os.path.getsize("SO.csv")
fh = "SO.csv"
rh = open(fh, "rU")

rows = 0
try:
    rlist = csv.reader(rh)
    reports = []
    for row in rlist:
        '''print row.items()'''
        rowStripped = [x.strip(' ') for x in row]
        reports.append(rowStripped)
        rows +=1
except csv.Error as e:
    # fh holds the file name and rlist is the csv reader defined above
    sys.exit('file %s, line %d: %s' % (fh, rlist.line_num, e))

finally:
    rh.close()

root = etree.Element("co_ehs")
object = etree.SubElement(root, "object")
event = etree.SubElement(object, "event")
facets = etree.SubElement(event, "facets")
categories = etree.SubElement(facets, "categories")
instance = etree.SubElement(categories, "instance")
property = etree.SubElement(instance, "property")

facets = ['header','header','header','header','informational','header','informational']

categories =     ['processing','processing','processing','processing','short_title','file_num','short_narrative']

property = ['REPORT ID','NEXT REPORT ID','initial-event-date','number','title','summary-docket-num','description-story']

print('----------Printing Reports from CSV Data----------')
print reports
print('---------END OF CSV DATA-------------')
print
mappings = zip(facets, categories, property)
print('----------Printing Mappings from the zip of facets, categories, property ----------')
print mappings
print('---------END OF List Comprehension-------------')
print
print('----------Printing the xml skeleton that will contain the mappings and the csv data ----------')
print(etree.tostring(root, xml_declaration=True, encoding='UTF-8', pretty_print=True))
print('---------END OF XML Skeleton-------------')

----My OUTPUT---  
----------Printing Reports from CSV Data----------  
[['1', '12-Dec-04', 'Vehicle Collision', '786689', 'No fault collision due to ice', '-1', '545671'], ['3', '15-Dec-04', 'OJT Injury', '87362', 'Paint fumes combusted causing 2nd degree burns', '4', '588456'], ['4', '17-Dec-04', 'OJT Injury', '87362', 'Paint fumes combusted causing 2nd degree burns', '-1', '58871'], ['1000', '12-Nov-05', 'Back Injury', '9854231', 'Lifting without a support device', '-1', '545671'], ['55555', '12-Jan-06', 'Foot Injury', '7936547', 'Office injury - heavy item dropped on foot', '-1', '545671']]  
---------END OF CSV DATA-------------  
----------Printing Mappings from the zip of facets, categories, property ----------  
[('header', 'processing', 'REPORT ID'), ('header', 'processing', 'NEXT REPORT ID'), ('header', 'processing', 'initial-event-date'), ('header', 'processing', 'number'), ('informational', 'short_title', 'title'), ('header', 'file_num', 'summary-docket-num'), ('informational', 'short_narrative', 'description-story')]  
---------END OF List Comprehension-------------  
----------Printing the xml skeleton that will contain the mappings and the csv data ----------  

    <?xml version='1.0' encoding='UTF-8'?>
    <co_ehs>
      <object>
        <event>
          <facets>
            <categories>
              <instance>
                <property/>
              </instance>
            </categories>
          </facets>
        </event>
      </object>
    </co_ehs>

---------END OF XML Skeleton-------------  
----------CSV DATA------------------  
C_ID,NEXT_C_ID,C_DATE,C_NUMBER,C_EVENT,C_DOCKETNUM,C_DESCRIPTION  
1,-1,12-Dec-04,545671,Vehicle Collision,786689,"No fault collision due to ice"  
3,4,15-Dec-04,588456,OJT Injury,87362,"Paint fumes combusted causing 2nd degree burns"  
4,-1,17-Dec-04,58871,OJT Injury,87362,"Paint fumes combusted causing 2nd degree burns"  
1000,-1,12-Nov-05,545671,Back Injury,9854231,"Lifting without a support device"  
55555,-1,12-Jan-06,545671,Foot Injury,7936547,"Office injury - heavy item dropped on foot"  

-----------What I want the xml output to look like----------------------  
    <?xml version="1.0" encoding="UTF-8"?>
    <co_ehs xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="co_ehs.xsd">  
      <object id="3" object-type="ehs_report">
        <event event-tag="0">
          <facets name="header">
            <categories name="processing">
              <instance instance-tag="0">
                <property name="REPORT ID" value="1"/>
                <property name="NEXT REPORT ID" value="-1"/>
                <property name="initial-event-date" value="12-Dec-04"/>
                <property name="number" value="545671"/>
              </instance>
            </categories>
          </facets>
          <facets name="informational">
            <categories name="short_title">
              <instance instance-tag="0">
                <property name="title" value="Vehicle Collision"/>
              </instance>
            </categories>
          </facets>
          <facets name="header">
            <categories name="file_num">
              <instance instance-tag="0">
                <property name="summary-docket-num" value="786689"/>
              </instance>
            </categories>
          </facets>
          <facets name="informational">
            <categories name="short_narrative">
              <instance instance-tag="0">
                <property name="description-story" value="No fault collision due to ice"/>
              </instance>
            </categories>
          </facets>
        </event>
      </object>
    </co_ehs>

Here is my solution. I use lxml, because it's normally better to generate XML with a framework than with strings or a template file.

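The attributes of co_ehs are missing, but they can be added with a few set() calls. Note that xsi:noNamespaceSchemaLocation is a namespaced attribute, so the xsi namespace has to be declared on the element (for example through lxml's nsmap argument) instead of passing a plain "xsi:..." name to set(). I leave it up to you to do this.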
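For the namespaced attribute, a minimal sketch is shown here. It assumes Python 3 and uses the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml accepts the same "{namespace-uri}name" notation in set(), with the prefix declared through nsmap when the element is created.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Ask the serializer to use the conventional "xsi" prefix for this namespace.
ET.register_namespace("xsi", XSI)

root = ET.Element("co_ehs")
# Namespaced attributes are addressed as "{namespace-uri}local-name".
root.set("{%s}noNamespaceSchemaLocation" % XSI, "co_ehs.xsd")

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The serializer emits the xmlns:xsi declaration on the root element by itself once a name from that namespace is used.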

BTW: You can accept the best answer by clicking on the check mark on the left side of the answer.

import csv, datetime, os, sys
from lxml import etree

def makeFacet(event, newheaders, ev, facetname, catname, count, nhposstart, nhposend):
    facets = etree.SubElement(event, "facets", name=facetname)
    categories = etree.SubElement(facets, "categories", name=catname)
    instance = etree.SubElement(categories, "instance") 
    instance.set("instance-tag", count)

    for i in range(nhposstart, nhposend):
        property = etree.SubElement(instance, "property")
        property.set("name", newheaders[i])
        property.set("value", ev[i].strip())


# read the csv
fh = "SO.csv"
rh = open(fh, "rU")

try:
    reader = csv.reader(rh)
    rlist = list(reader)
except csv.Error as e:
    sys.exit("file %s, line %d: %s" % (fh, reader.line_num, e))
finally:
    rh.close()

# generate the xml

# newheaders maps the csv columns (in order) to the property names used in the XML,
# because the csv column names don't correspond to them
newheaders = ["REPORT ID","NEXT REPORT ID","initial-event-date","number","title","summary-docket-num", "description-story"]

root = etree.Element("co_ehs")

object = etree.SubElement(root, "object")

object.set("id", "3") # Not sure about this one
object.set("object-type", "ehs_report")

for c, ev in enumerate(rlist[1:]):
    event  = etree.SubElement(object, "event")
    event.set("event-tag", "%s"%c) 
    makeFacet(event, newheaders, ev, "header", "processing", "%s"%c, 0, 4)
    makeFacet(event, newheaders, ev, "informational", "short_title", "%s"%c, 4, 5)
    makeFacet(event, newheaders, ev, "header", "file_num", "%s"%c, 5, 6)
    makeFacet(event, newheaders, ev, "informational", "short_narrative", "%s"%c, 6, 7)

print(etree.tostring(root, xml_declaration=True, encoding="UTF-8", pretty_print=True))
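The hard-coded column ranges passed to makeFacet can also be derived from the zip(facets, categories, property) mappings in the question. The following is a minimal sketch of that idea, not a drop-in replacement. It assumes Python 3 and the standard library's xml.etree.ElementTree instead of lxml (Element and SubElement are called the same way in both). The helper name build_tree, the inlined CSV_TEXT sample, and the fixed instance-tag of "0" (taken from the wanted output) are assumptions for illustration.

```python
import csv
import io
from itertools import groupby
import xml.etree.ElementTree as ET

# Two sample rows from the question, inlined so the sketch runs without SO.csv.
CSV_TEXT = '''C_ID,NEXT_C_ID,C_DATE,C_NUMBER,C_EVENT,C_DOCKETNUM,C_DESCRIPTION
1,-1,12-Dec-04,545671,Vehicle Collision,786689,"No fault collision due to ice"
3,4,15-Dec-04,588456,OJT Injury,87362,"Paint fumes combusted causing 2nd degree burns"
'''

facets = ['header', 'header', 'header', 'header',
          'informational', 'header', 'informational']
categories = ['processing', 'processing', 'processing', 'processing',
              'short_title', 'file_num', 'short_narrative']
names = ['REPORT ID', 'NEXT REPORT ID', 'initial-event-date', 'number',
         'title', 'summary-docket-num', 'description-story']
mappings = list(zip(facets, categories, names))


def build_tree(rows):
    """Build the co_ehs tree with one <event> per CSV row (hypothetical helper)."""
    root = ET.Element("co_ehs")
    obj = ET.SubElement(root, "object", {"id": "3", "object-type": "ehs_report"})
    for tag, row in enumerate(rows):
        event = ET.SubElement(obj, "event", {"event-tag": str(tag)})
        # Pair each column value with its (facet, category, name) mapping, then
        # group neighbouring columns that share the same facet and category.
        paired = zip(mappings, (value.strip() for value in row))
        for (facet, category), group in groupby(paired, key=lambda p: p[0][:2]):
            facets_el = ET.SubElement(event, "facets", name=facet)
            cat_el = ET.SubElement(facets_el, "categories", name=category)
            # Assumption: instance-tag stays "0", as in the wanted output.
            instance = ET.SubElement(cat_el, "instance", {"instance-tag": "0"})
            for (_, _, prop_name), value in group:
                ET.SubElement(instance, "property", name=prop_name, value=value)
    return root


reader = csv.reader(io.StringIO(CSV_TEXT))
next(reader)  # skip the header row
root = build_tree(reader)
print(ET.tostring(root, encoding="unicode"))
```

With this layout, adding or regrouping a column only means editing the three lists; the tree-building loop stays the same.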

I created a file named 'pattern.txt' with the following content (with this indentation).

Notice the 8 %s put at strategic places.

        <event event-tag="%s">
          <facets name="header">
            <categories name="processing">
              <instance instance-tag="0">
                <property name="REPORT ID" value="%s"/>
                <property name="NEXT REPORT ID" value="%s"/>
                <property name="initial-event-date" value="%s"/>
                <property name="number" value="%s"/>
              </instance>
            </categories>
          </facets>
          <facets name="informational">
            <categories name="short_title">
              <instance instance-tag="0">
                <property name="title" value="%s"/>
              </instance>
            </categories>
          </facets>
          <facets name="header">
            <categories name="file_num">
              <instance instance-tag="0">
                <property name="summary-docket-num" value="%s"/>
              </instance>
            </categories>
          </facets>
          <facets name="informational">
            <categories name="short_narrative">
              <instance instance-tag="0">
                <property name="description-story" value="%s"/>
              </instance>
            </categories>
          </facets>
        </event>

I created the file 'SO.csv' with the following content:

C_ID,NEXT_C_ID,C_DATE,C_NUMBER,C_EVENT,C_DOCKETNUM,C_DESCRIPTION  
1,-1,12-Dec-04,545671,Vehicle Collision,786689,"No fault collision due to ice"  
3,4,15-Dec-04,588456,OJT Injury,87362,"Paint fumes combusted causing 2nd degree burns"  
4,-1,17-Dec-04,58871,OJT Injury,87362,"Paint fumes combusted causing 2nd degree burns"  
1000,-1,12-Nov-05,545671,Back Injury,9854231,"Lifting without a support device"  
55555,-1,12-Jan-06,545671,Foot Injury,7936547,"Office injury - heavy item dropped on foot"

And I ran the following code:

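import csv

with open('pattern.txt') as f:
    pati = f.read()

# the XML declaration must start the document, so it carries no leading spaces
xmloutput = ['<?xml version="1.0" encoding="UTF-8"?>',
             '    <co_ehs xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '\
             'xsi:noNamespaceSchemaLocation="co_ehs.xsd">',
             '      <object id="3" object-type="ehs_report">']

with open('SO.csv','rb') as csvfile:
    rid = csv.reader(csvfile)
    rid.next()  # skip the header row
    for i,row in enumerate(rid):
        # insert the event number as one item; row[0:0] = str(i) would
        # insert one item per digit as soon as i reaches 10
        row.insert(0, str(i))
        xmloutput.append( pati % tuple(row) )

# close the elements opened in the header lines
xmloutput.append('      </object>')
xmloutput.append('    </co_ehs>')

print '\n'.join(xmloutput)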

Does this help you?
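One caveat with filling a text pattern through %s: the values are pasted in verbatim, so a field containing &, < or a double quote would produce XML that no longer parses. A minimal sketch of escaping the values first, assuming Python 3 and xml.sax.saxutils.escape from the standard library; the one-line PATTERN, the helper attr_escape, and the sample value are made up for illustration and are not part of the CSV above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical one-line pattern in the same style as pattern.txt.
PATTERN = '<property name="title" value="%s"/>'


def attr_escape(value):
    """Escape &, < and > plus the double quote that delimits the attribute."""
    return escape(value, {'"': "&quot;"})


# Made-up sample value containing every character that needs escaping.
row = ['Slip & fall, "wet floor" sign <missing>']
line = PATTERN % tuple(attr_escape(v) for v in row)
print(line)

# Parsing the line back recovers the original value unchanged.
parsed = ET.fromstring(line)
print(parsed.get("value") == row[0])
```

The same attr_escape call could be applied to every item of the row before the pattern is filled in.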
