开发者

pythonic way to access data in file structures

I want to access every value (~10000) in .txt files (~1000) stored in 开发者_如何学编程directories (~20) in the most efficient manner possible. When the data is grabbed I would like to place them in a HTML string. I do this in order to display a HTML page with tables for each file. Pseudo:

    fh=open('MyHtmlFile.html','w')
    fh.write('''<head>Lots of tables</head><body>''')
    for eachDirectory in rootFolder:

        for eachFile in eachDirectory:
            concat=''

            for eachData in eachFile:
               concat=concat+<tr><td>eachData</tr></td>
            table='''
                  <table>%s</table>
                  '''%(concat)
        fh.write(table)
    fh.write('''</body>''')
    fh.close()

There must be a better way (I imagine it would take forever)! I've checked out set() and read a bit about hashtables but rather ask the experts before the hole is dug.

Thank you for your time! /Karl


import os, os.path
# If you're on Python 2.5 or newer, use 'with'
# needs 'from __future__ import with_statement' on 2.5
fh=open('MyHtmlFile.html','w')
fh.write('<html>\r\n<head><title>Lots of tables</title></head>\r\n<body>\r\n')
# this will recursively descend the tree
for dirpath, dirname, filenames in os.walk(rootFolder):
    for filename in filenames:
        # again, use 'with' on Python 2.5 or newer
        infile = open(os.path.join(dirpath, filename))
        # this will format the lines and join them, then format them into the table
        # If you're on Python 2.6 or newer you could use 'str.format' instead
        fh.write('<table>\r\n%s\r\n</table>' % 
                     '\r\n'.join('<tr><td>%s</tr></td>' % line for line in infile))
        infile.close()
fh.write('\r\n</body></html>')
fh.close()


Why do you "imagine it would take forever"? You are reading the file and then printing it out - that's pretty much the only thing you have as a requirement - and that's all you're doing. You could tweak the script in a couple of ways (read blocks not lines, adjust buffers, print out instead of concatenating, etc.), but if you don't know how much time do you take now, how do you know what is better/worse?

Profile first, then find if the script is too slow, then find a place where it's slow, and only then optimise (or ask about optimisation).

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜