Python table classification

I have data of different types, for example:

4.5,3.5,U1
4.5,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1

I need output:

'U1': [['4.5', '3.5'], ['4.5', '6'], ['5', '7'], ['7', '6.5']]
'U2': [['4.5', '10.5'], ['3.5', '10.5'], ['3.5', '10.5']]

So my code is:

import csv

reader = csv.reader(open('test.data', 'r'))
result = {}
for row in reader:
    uclass=row[-1]
    if result.has_key(uclass):
        result[uclass].append([row[0],row[1]])       #--->how can I change from 0 to -2 row ??
    else:
        result[uclass]=[[row[0],row[1]]]             #--->-->how can I change from 0 to -2 row ??
print repr(result)

But I need this code to work for any other input data, where each row has many columns, not just 3!

See comment in code


result[uclass].append(row[:-1])

and

result[uclass] = [row[:-1]]

This notation is called slicing.
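
To make the slice concrete, here is a minimal sketch, assuming Python 3 and a row that csv.reader has already split into strings. The names values, label and wide_row are only illustrative and do not come from the question.

```python
# A single parsed CSV row: any number of value columns, class label last.
row = ['4.5', '3.5', 'U1']

values = row[:-1]   # every element except the last one (a new list)
label = row[-1]     # the last element only

print(values)  # ['4.5', '3.5']
print(label)   # U1

# The same slice works unchanged on a wider row.
wide_row = ['4.5', '45', '73.3', '56', 'U3']
wide_values = wide_row[:-1]
print(wide_values)  # ['4.5', '45', '73.3', '56']
```

Because row[:-1] never mentions a column count, the loop does not need to change when the input gains more columns.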


This perhaps?

data = """\
4.5,3.5,U1
4.5,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1""".splitlines()

from collections import defaultdict
dd = defaultdict(list)
for d in data:
    dl = d.split(',')
    dd[dl[-1]].append(list(map(float, dl[:-1])))

for key in dd:
    print key, dd[key]

prints:

U1 [[4.5, 3.5], [4.5, 6.0], [5.0, 7.0], [7.0, 6.5]]
U2 [[4.5, 10.5], [3.5, 10.5], [3.5, 10.5]]


Here is your code with a few small changes.

import csv
reader = csv.reader(open('test.data', 'r'))
result = {}
for row in reader:
  if not row:                         # skip blank lines
    continue
  uclass = row[-1]                    # last column is the class label
  if uclass in result:
    result[uclass].append(row[:-1])   # all columns except the last
  else:
    result[uclass] = [row[:-1]]       # start a new list of rows for this class
print(repr(result))

I tested it on the following data, and it works.

5.66,4.5,3.5,U1
4.5,23123,34,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1
4.5,45,73.3,56,66,72.5,U3
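
A quick way to check this on rows of different widths, without a file on disk, is sketched below. It assumes Python 3, and group_by_last_column is a hypothetical helper name, not something from the question. It uses dict.setdefault in place of the if/else, which is a swapped-in shortcut rather than the code above.

```python
import csv
import io

# The test data from above, held in memory instead of in 'test.data'.
SAMPLE = """\
5.66,4.5,3.5,U1
4.5,23123,34,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1
4.5,45,73.3,56,66,72.5,U3
"""

def group_by_last_column(lines):
    """Group each row's leading values under the label in its last column."""
    result = {}
    for row in csv.reader(lines):
        if not row:          # skip blank lines
            continue
        # setdefault returns the existing list, or stores and returns a new one
        result.setdefault(row[-1], []).append(row[:-1])
    return result

groups = group_by_last_column(io.StringIO(SAMPLE))
for label in sorted(groups):
    print(label, groups[label])
```

Each class keeps its rows as lists of strings, and rows of different lengths can sit side by side under the same label.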


import csv
import collections

def main():
    with open('testdata.csv', 'rb') as inf:
        incsv = csv.reader(inf)
        res = collections.defaultdict(list)
        for row in incsv:
            key = row.pop()
            res[key].append([float(r) for r in row])

    for key,val in res.iteritems():
        print("{0}: {1}".format(key, val))

if __name__=="__main__":
    main()

results in

U1: [[4.5, 3.5], [4.5, 6.0], [5.0, 7.0], [7.0, 6.5]]
U2: [[4.5, 10.5], [3.5, 10.5], [3.5, 10.5]]

Comments:

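  1. In Python 2, csv.reader expects a file opened in binary mode, so use 'rb' as the read mode. (In Python 3, open the file in text mode with newline='' instead.)

  2. Slicing a list creates a new copy of the list; pop() removes the last item in place without copying, so it is slightly more efficient in time and memory.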
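
For Python 3, the same approach might look like the sketch below. It is an adaptation under assumptions, not the code above verbatim: load_groups is a hypothetical function name, the file is opened in text mode with newline='' as the csv module documentation recommends, and the demo writes the question's sample data to a temporary file so the block runs on its own.

```python
import collections
import csv
import os
import tempfile

def load_groups(path):
    """Read a CSV file and group float values by the label in the last column."""
    res = collections.defaultdict(list)
    # Python 3: text mode with newline='' rather than 'rb'.
    with open(path, newline='') as inf:
        for row in csv.reader(inf):
            if not row:          # skip blank lines
                continue
            key = row.pop()      # removes the label from the row in place
            res[key].append([float(r) for r in row])
    return res

# Demo on a temporary file holding the question's sample data.
sample = (
    "4.5,3.5,U1\n4.5,10.5,U2\n4.5,6,U1\n3.5,10.5,U2\n"
    "3.5,10.5,U2\n5,7,U1\n7,6.5,U1\n"
)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

try:
    groups = load_groups(tmp_path)
finally:
    os.remove(tmp_path)          # clean up the temporary file

for key, val in groups.items():  # items() replaces Python 2's iteritems()
    print("{0}: {1}".format(key, val))
```

Note that float() raises ValueError on a non-numeric cell, so this sketch assumes every column except the last holds a number.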
