
How to sum all the numbers of one column in Python?

I need to know how to sum all the numbers of a column in a CSV file.

For example, my data looks like this:

column  count   min max sum mean
80  29573061    2   40  855179253   28.92
81  28861459    2   40  802912711   27.82
82  28165830    2   40  778234605   27.63
83  27479902    2   40  754170015   27.44
84  26800815    2   40  729443846   27.22
85  26127825    2   40  701704155   26.86
86  25473985    2   40  641663075   25.19
87  24827383    2   40  621981569   25.05
88  24189811    2   40  602566423   24.91
89  23566656    2   40  579432094   24.59
90  22975910    2   40  553092863   24.07
91  22412345    2   40  492993262   22
92  21864206    2   40  475135290   21.73
93  21377772    2   40  461532152   21.59
94  20968958    2   40  443921856   21.17
95  20593463    2   40  424887468   20.63
96  20329969    2   40  364319592   17.92
97  20157643    2   40  354989240   17.61
98  20104046    2   40  349594631   17.39
99  20103866    2   40  342152213   17.02
100 20103866    2   40  335379448   16.6
(In the actual file the columns are separated by tabs.)

The code I've written so far is:

import sys
import csv

def ErrorCalculator(file):
        reader = csv.reader(open(file), dialect='excel-tab' )

        for row in reader:
                PxCount = 10**(-float(row[5])/10)*float(row[1])


if __name__ == '__main__':
        ErrorCalculator(sys.argv[1])

For this particular code I need to sum all the numbers in PxCount and divide by the sum of all numbers in row[1]...

I'd be grateful if you could tell me how to sum the numbers of a column, or help me with this code.

Also, a tip on how to skip the header would be appreciated.


You can call next(reader) (reader.next() in Python 2) right after instantiating the reader to discard the first line.

To sum the PxCount, set total = 0 before your loop and do total += PxCount after you calculate it for each row. (Avoid naming the variable sum, since that would shadow the built-in sum() function.)

PS: You might find csv.DictReader helpful too.
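
For the general question in the title (summing one column), here is a minimal sketch of the same idea using the built-in sum(). The function name column_sum and the sample rows are only illustrative; the rows are copied from the question's data, and Python 3 is assumed.

```python
import csv

def column_sum(lines, index):
    """Sum one column of tab-separated text, skipping the header row.

    `lines` is any iterable of text lines (an open file works too).
    """
    reader = csv.reader(lines, dialect='excel-tab')
    next(reader, None)  # discard the header; the None default avoids an error on empty input
    return sum(float(row[index]) for row in reader)

# two rows taken from the question, with real tab separators
sample = [
    "column\tcount\tmin\tmax\tsum\tmean",
    "80\t29573061\t2\t40\t855179253\t28.92",
    "81\t28861459\t2\t40\t802912711\t27.82",
]
print(column_sum(sample, 1))  # sum of the "count" column
```

Because csv.reader accepts any iterable of strings, the same function works on a list (as here) or on a file object opened with open().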


You could keep a running total using an "augmented assignment" +=:

total=0
for row in reader:
        PxCount = 10**(-float(row[5])/10)*float(row[1])
        total+=PxCount

To skip the first line (header) in the csv file:

with open(file) as f:
    next(f)  # read and throw away first line in f
    reader = csv.reader(f, dialect='excel-tab' )
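
Putting the two fragments together might look like the sketch below. It assumes Python 3 and the column layout from the question (count in row[1], mean in row[5]); the name error_calculator is just a lowercase version of the asker's function. Note that the loop has to stay inside the with block, because the reader only pulls lines from the file while it is still open.

```python
import csv

def error_calculator(path):
    px_total = 0.0
    count_total = 0.0
    # newline='' is what the csv docs recommend when opening files in Python 3
    with open(path, newline='') as f:
        next(f)  # read and throw away the header line
        reader = csv.reader(f, dialect='excel-tab')
        for row in reader:  # runs while the file is still open
            count = float(row[1])
            px_total += 10 ** (-float(row[5]) / 10) * count
            count_total += count
    # sum of all PxCount values divided by the sum of the count column
    return px_total / count_total
```

A file with a header but no data rows would make count_total zero and raise ZeroDivisionError, so you may want to check for that case.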


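Using a DictReader will result in far clearer code, and it consumes the header row for you. Decimal avoids the binary rounding error that float introduces when parsing and adding decimal values. Also try to follow Python naming conventions and use lowercase names for functions and variables.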

import csv
import decimal

def calculate(file):
    total_px = 0
    total_count = 0
    # DictReader takes the column names from the header row
    with open(file) as f:
        reader = csv.DictReader(f, dialect='excel-tab')
        for row in reader:
            r_count = decimal.Decimal(row['count'])
            r_mean = decimal.Decimal(row['mean'])
            # formula from the question: 10**(-mean/10) * count
            total_px += 10 ** (-r_mean / 10) * r_count
            total_count += r_count
    # the question divides by the sum of the count column (row[1])
    return total_px / total_count
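
As a small illustration of the precision point, the sketch below adds the same ten values once as float and once as Decimal. The values are made up for the demonstration and are not from the question's file.

```python
from decimal import Decimal

values = ['0.1'] * 10  # illustrative input strings, as they would come out of a CSV row

float_total = sum(float(v) for v in values)
decimal_total = sum(Decimal(v) for v in values)

print(float_total == 1.0)             # False: 0.1 has no exact binary float representation
print(decimal_total == Decimal('1'))  # True: decimal addition of these values is exact
```

This only covers parsing and addition. A power with a non-integer exponent, such as 10 ** (-r_mean / 10), is still rounded to the precision of the current decimal context (28 significant digits by default).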