
Creating a list from a CSV file using Python

I have a Python script that so far does what I want it to: it opens the CSV file chosen by the user, splits the rows into different predefined "pools", and writes each pool out to its own file with the proper headers. My only problem is that I want to change the pool list from static to variable, and I am having some issues.

The pool list is in the CSV itself, in column 2, and values can be repeated. With the current setup the script can create "dead" files that contain no data aside from the header.

A few notes: yes, I know my spelling is not perfect, and yes, I know that some of my comments are a bit off.

import csv
# used to read and make CSVs
import time
# used to timestamp files
import tkFileDialog
# used to allow user input
filename = tkFileDialog.askopenfilename(defaultextension = ".csv")
# the only user input: locates the file itself
csvfile = []
# declares csvfile as an empty list
pools = ["1","2","4","6","9","A","B","D","E","F","I","K","L","M","N","O","P","W","Y"]
# declares the pools list for known pools
for i in pools:
    # uses the pools list to make a large number of variables
    exec("pool"+i+"=[]")
reader = csv.reader(open(filename, "rb"), delimiter = ',')
# opens the CSV for the reader to use
for row in reader:
    csvfile.append(row)
    # dumps the CSV into a variable
    headers=[]
    # declares headers as an empty list
    headers.append(csvfile[0])
    # appends the first row to the headers variable
for row in csvfile:
    pool = str(row[1]).capitalize()
    # makes sure all pools in the main data are capitalized
    if pool in pools:
        exec("pool"+pool+".append(row)")
        # finds the pool list and appends the new item to it
    else:
        pass
for i in pools:
    exec("wp=csv.writer(open('pool "+i+" "+time.strftime("%Y%m%d")+".csv','wb'),)")
    wp.writerows(headers)
    # adds the header row
    exec("wp.writerows(pool"+i+")")
    # creates the CSV with a timestamp using the pool list
    #-----Needs headers written in on each file -----

EDIT: As there have been some questions

Reason for the code: I have daily reports being generated, and part of handling them is a manual process of splitting each report into separate pool reports. I am creating this script so that I can quickly select the file and split it out into one file per pool.

The main CSV can be from 50 to 100 items long. It has a total of 25 columns, and the pool is always listed in the second column. Not all pools will be listed every time, and pools will show up more than once.

I have tried a few different loops so far; one is as follows

pools = []
for line in file(open(filename,'rb')):
    line = line.split()
    x = line[1]
    pools.append(x)

But I get a List error with this.

An example of the CSV:

Ticket  Pool  Date       Column 4  Column 5
1       A     11/8/2010  etc       etc
2       A     11/8/2010  etc       etc
3       1     11/8/2010  etc       etc
4       6     11/8/2010  etc       etc
5       B     11/8/2010  etc       etc
6       A     11/8/2010  etc       etc
7       1     11/8/2010  etc       etc
8       2     11/8/2010  etc       etc
9       2     11/8/2010  etc       etc
10      1     11/8/2010  etc       etc


If I understand correctly what you want to achieve, this could be a solution:

import csv
import time
import tkFileDialog

filename = tkFileDialog.askopenfilename(defaultextension = ".csv")

reader = csv.reader(open(filename, "rb"), delimiter = ',')

headders = reader.next()

pool_dict = {}

for row in reader:
    if not pool_dict.has_key(row[1]):
        pool_dict[row[1]] = []
    pool_dict[row[1]].append(row)
       
for key, val in pool_dict.items():
    wp = csv.writer(open('pool ' +key+ ' '+time.strftime("%Y%m%d")+'.csv','wb'),)
    wp.writerow(headders)
    wp.writerows(val)

EDIT: misunderstood the headers and pools thing in the first place and tried to correct the issue.

EDIT 2: corrected the pool to be dynamically created from values found in file.

If not, please provide more details of your Problem…
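
For anyone on Python 3, the same grouping idea could be sketched roughly as below. This is only a sketch under assumptions: `split_by_pool`, `write_pools`, and the in-memory `sample` data are names made up for illustration, the file dialog is left out, and the files are opened in text mode with `newline=""` instead of `'wb'`, as the Python 3 csv module expects.

```python
import csv
import io
import os
import time
from collections import defaultdict


def split_by_pool(lines, pool_column=1):
    """Return (header, {pool: [rows]}) from an iterable of CSV lines."""
    reader = csv.reader(lines)
    header = next(reader)  # Python 3 spelling of reader.next()
    pools = defaultdict(list)
    for row in reader:
        if len(row) <= pool_column:
            continue  # skip blank or short rows
        # capitalize() mirrors the normalisation used in the question
        pools[row[pool_column].strip().capitalize()].append(row)
    return header, dict(pools)


def write_pools(header, pools, directory="."):
    """Write one CSV per pool that actually has rows; return the paths."""
    stamp = time.strftime("%Y%m%d")
    paths = []
    for pool, rows in sorted(pools.items()):
        path = os.path.join(directory, "pool %s %s.csv" % (pool, stamp))
        with open(path, "w", newline="") as fp:
            writer = csv.writer(fp)
            writer.writerow(header)
            writer.writerows(rows)
        paths.append(path)
    return paths


# Hypothetical sample data standing in for the user's file
sample = io.StringIO(
    "Ticket,Pool,Date\n"
    "1,A,11/8/2010\n"
    "2,a,11/8/2010\n"
    "3,1,11/8/2010\n"
)
header, pools = split_by_pool(sample)
```

Because the pools come from the data, only pools that appear in the file get an output file, so no header-only files are produced. Calling `write_pools(header, pools)` would then write into the current directory.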


Can you describe your CSV file a little bit?

One suggestion is to change

for i in pools:
#uses the Pools List and makes a large number of variables
    exec("pool"+i+"=[]")

to the more pythonic form:

pool_dict = {}
for i in pools:
    pool_dict[i] = []

In general it's bad practice to use eval/exec, and it is much easier to, say, loop through a dictionary. E.g., access the lists with pool_dict['A'] or pool_dict['1'], or loop through all of them like this:

for key,val in pool_dict.items():
   val.append(...)
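
To make that concrete, here is a small self-contained sketch of the dictionary replacing the exec-generated variables. The shortened `pools` list and the `rows` data are made up for illustration, and `non_empty` is just a name I picked; it shows one possible way to skip pools that received no rows.

```python
pools = ["1", "2", "A", "B"]  # shortened, hypothetical pool list

# One dictionary replaces the exec-generated pool1, pool2, poolA, ... variables
pool_dict = {name: [] for name in pools}

# Hypothetical rows standing in for what csv.reader would yield
rows = [
    ["1", "A", "11/8/2010"],
    ["2", "a", "11/8/2010"],
    ["3", "1", "11/8/2010"],
    ["4", "Z", "11/8/2010"],  # pool not in the known list, so it is skipped
]

for row in rows:
    pool = row[1].capitalize()
    if pool in pool_dict:
        pool_dict[pool].append(row)

# Keeping only pools that have rows avoids header-only "dead" files
non_empty = {name: members for name, members in pool_dict.items() if members}
```

Looping over `non_empty.items()` when writing would then produce files only for pools "1" and "A" in this example.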

EDIT: Now seeing the CSV data, try something like this:

for row in reader:
    if row[0] == 'Ticket':
        header = row
    else:
        cur_pool = row[1].capitalize()
        if not pool_dict.has_key(cur_pool):
            pool_dict[cur_pool] = [row,]
        else:
            pool_dict[cur_pool].append(row)

for p, pool_vals in pool_dict.items():
    with open('pool'+p+'_'+time.strftime("%Y%m%d")+'.csv','wb') as fp:
        wp = csv.writer(fp)
        wp.writerow(header)
        wp.writerows(pool_vals)


Your code would be a lot easier to read without all those execs. It seems you used them to declare all of your variables, when in fact you could declare a list of pools like this:

pool_lists = [[] for p in pools]

This is my best guess for what you mean by "I want to change the Pool list from a static to a variable." When you do this, you will have a list of lists, of the same length as pools.
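
A rough sketch of how the list of lists might be filled, assuming a shortened `pools` list and made-up `rows`; the `index_of` lookup is a helper I added for illustration and is not part of the original code.

```python
pools = ["1", "2", "A"]  # shortened, hypothetical pool list

# One independent empty list per pool, in the same order as pools
pool_lists = [[] for p in pools]

# Hypothetical helper: map each pool name to its position in pool_lists
index_of = {name: i for i, name in enumerate(pools)}

# Hypothetical rows standing in for what csv.reader would yield
rows = [["1", "A"], ["2", "a"], ["3", "1"], ["4", "Z"]]

for row in rows:
    i = index_of.get(row[1].capitalize())
    if i is not None:  # rows from unknown pools are ignored
        pool_lists[i].append(row)
```

Note that `[[]] * len(pools)` would not work here, because it repeats a reference to one and the same list; the list comprehension creates a separate list for each pool.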
