How do I open all files of a certain type in Python and process them?

I'm trying to figure out how to make Python go through a directory full of CSV files, process each file, and spit out a text file with a trimmed list of values.

In this example, I'm iterating through a CSV with lots of different columns, but all I really want are the first name, last name, and keyword. I have a folder full of these CSVs with varying columns (though they all contain first name, last name, and keyword somewhere). What's the best way to open that folder, go through each CSV file, and then spit it all out, either as its own CSV file or as a plain text list like in the example below?

import csv

reader = csv.reader(open("keywords.csv"))
F = open('compiled.txt', 'w')
for rownum, row in enumerate(reader):
    if rownum == 0:
        # Header row: record the index of each column we care about.
        for headnum, col in enumerate(row):
            if col == 'Keyword':
                keywordnum = headnum
            elif col == 'First Name':
                firstnamenum = headnum
            elif col == 'Last Name':
                lastnamenum = headnum
    else:
        print(row[keywordnum] + '\n' + row[firstnamenum] + '\n' + row[lastnamenum])
        F.write(row[keywordnum] + '\n')
F.close()


The best way is probably to use the shell's globbing ability or, alternatively, Python's glob module.

Shell (Linux, Unix)

Shell:

python myapp.py folder/*.csv

myapp.py:

import sys

for filename in sys.argv[1:]:
    with open(filename) as f:
        pass  # do something with f

Windows (or no shell available)

import glob

for filename in glob.glob("folder/*.csv"):
    with open(filename) as f:
        pass  # do something with f

Note: on Python 2.5, the with statement needs from __future__ import with_statement


The "get all the CSV files" part of the question has been answered several times (including by the OP), but the "get the right named columns" hasn't yet: csv.DictReader makes it trivial -- the "process one CSV file" loop becomes just:

reader = csv.DictReader(open(thecsvfilename))
for row in reader:
    # join takes a single iterable, so wrap the three fields in a tuple
    print('\n'.join((row['Keyword'], row['First Name'], row['Last Name'])))
    F.write(row['Keyword'] + '\n')
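
Putting the pieces together, here's a minimal sketch of the whole job (assuming a folder/ directory of CSVs and the same 'Keyword', 'First Name', and 'Last Name' headers as above):

import csv
import glob

# Pull the keyword out of every CSV in the folder into one text file.
with open('compiled.txt', 'w') as out:
    for filename in glob.glob('folder/*.csv'):
        with open(filename) as f:
            for row in csv.DictReader(f):
                out.write(row['Keyword'] + '\n')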


A few suggestions:

  • You could keep the header indices for Keyword, First Name, and Last Name in a dict instead of separate variables. That would make the script easier to modify later on.

  • You could use the list index() method instead of looping over the headers, e.g.:

    if rownum == 0:
        header_index = {}
        for header in ('Keyword', 'First Name', 'Last Name'):
            header_index[header] = row.index(header)

  • You could use the glob module to grab the filenames, but gs is probably right that shell globbing is a better way to do it.

  • It might be better to use the csv module for writing the output file as well; it handles quoting and escaping, so it would be more robust. A sketch follows this list.
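
For instance, a minimal sketch of writing the trimmed rows with csv.writer (the output filename and column order here are assumptions, not from the question):

import csv

# csv.writer quotes any field containing commas, quotes, or newlines,
# so the output survives a round trip through another CSV reader.
with open('compiled.csv', 'w') as out:
    writer = csv.writer(out)
    writer.writerow(['First Name', 'Last Name', 'Keyword'])    # header row
    writer.writerow(['Ada', 'Lovelace', 'analytical engine'])  # example row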


I think the best way to process a bunch of files in a directory is with os.walk (documented in the Python os module docs).

Here is an answer I wrote to another Python question, which includes working, tested Python code that uses os.walk to open a bunch of files. That version visits all subdirectories too, but it would be easy to modify it to stay in a single directory:

Replace strings in files by Python
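
As a rough illustration (a sketch, not the code from the linked answer), walking a directory tree and picking out just the CSV files might look like this:

import os

# os.walk yields (dirpath, dirnames, filenames) for every directory
# in the tree rooted at 'folder', subdirectories included.
for dirpath, dirnames, filenames in os.walk('folder'):
    for name in filenames:
        if name.endswith('.csv'):
            with open(os.path.join(dirpath, name)) as f:
                pass  # do something with f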


And I've answered my own question again... I imported the os and glob modules to nab a path.
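
Something along these lines, presumably (a guess at the approach, since the exact code isn't shown; the folder path and pattern are assumptions):

import glob
import os

# Build an absolute search pattern from a folder path, then glob for CSVs.
folder = os.path.abspath('folder')
for filename in glob.glob(os.path.join(folder, '*.csv')):
    print(filename)  # full path of each matching CSV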
