Parsing a Comma Delimited File With Python and adding currency fields
I'm trying to use Python to parse a comma delimited file with a layout similar to this:
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
What I want to accomplish is to total gross amount, adjustments, and Total Due, then add them on to the end of each line (or just on the last line of each document).
My question is how can I create a variable that only adds the fields as long as account number is the same? For example in English I would say:
Check Account num开发者_StackOverflow社区ber: Add Gross_amt on each line while account number is equal to account number on the previous line, then when account number changes append the total of the gross_amt fields as a new field at the end of the last line for that account as Gross_Amt_Total. Start Over.
You could use itertools.groupby()
:
import csv
from itertools import groupby
from operator import itemgetter
with open("data.csv", "rb") as f:
next(f) # Skip header
for account, lines in groupby(csv.reader(f), itemgetter(0)):
gross_amount = 0.
for line in lines:
print line
gross_amount += float(line[2])
print "The total gross amount for account", account, "is", gross_amount
csv module to read the data and itertools.groupby to group by account number:
import csv
from itertools import groupby
from StringIO import StringIO
data = StringIO('''\
AccountNumber,Invoice_Number,Gross_Amt,Adjustments,TotalDue
"234","56787","19.37",,"19.37"
"234","56788","204.76","-10.00","194.76"
"234","56789","139.77",,"139.77"
"567","12543","44.89","30.00","74.89"
''')
# Grab the header and rows of the data.
# groupby requires data sorted on the groupby key.
reader = csv.reader(data)
header = next(reader)
rows = sorted(reader)
print '{:13} {:14} {:9} {:11} {:8}'.format(*header)
# group by first item (acct number)
for acct,grp in groupby(rows,lambda r: r[0]):
print
gross_amt_total = 0
adjustments_total = 0
total_due_total = 0
for item in grp:
# everything comes in as a string, and blank strings don't cvt to float.
gross = float(item[2]) if item[2] else 0.0
adj = float(item[3]) if item[3] else 0.0
due = float(item[4]) if item[4] else 0.0
print '{:13} {:14} {:9.2f} {:11.2f} {:8.2f}'.format(item[0],item[1],gross,adj,due)
gross_amt_total += gross
adjustments_total += adj
total_due_total += due
print
print 'Totals for #{:13} {:9.2f} {:11.2f} {:8.2f}'.format(
acct,gross_amt_total,adjustments_total,total_due_total)
Output
AccountNumber Invoice_Number Gross_Amt Adjustments TotalDue
234 56787 19.37 0.00 19.37
234 56788 204.76 -10.00 194.76
234 56789 139.77 0.00 139.77
Totals for #234 363.90 -10.00 353.90
567 12543 44.89 30.00 74.89
Totals for #567 44.89 30.00 74.89
See here how to read a csv. http://docs.python.org/library/csv.html Basically what u want to do is keep a dictionary that maps the account to the value tou want to total. For each record on the file you index the dict with the account number and add the value you just read with what is already in the dict.
精彩评论