Python table classification
I have data of the following kind, for example:
4.5,3.5,U1
4.5,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1
I need output:
'U1': [['4.5', '3.5'], ['4.5', '6'], ['5', '7'], ['7', '6.5']]
'U2': [['4.5', '10.5'], ['3.5', '10.5'], ['3.5', '10.5']]
So my code is:
import csv
reader = csv.reader(open('test.data', 'r'))
result = {}
for row in reader:
    uclass=row[-1]
    if result.has_key(uclass):
        result[uclass].append([row[0],row[1]]) #--->how can I change from 0 to -2 row ??
    else:
        result[uclass]=[[row[0],row[1]]] #--->-->how can I change from 0 to -2 row ??
print repr(result)
But I need this code to work for other input data, where each row has many columns, not just 3!
See comment in code
result[uclass].append(row[:-1])
and
result[uclass] = [row[:-1]]
This notation is called slicing: row[:-1] is every element of the row except the last one.
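To make the slicing suggestion concrete, here is a minimal sketch written for Python 3. The helper name group_by_last and the sample rows are made up for illustration; they are not from the question. It shows that row[:-1] keeps working however many columns a row has.

```python
def group_by_last(rows):
    """Group rows by their last element; the other elements become the value."""
    result = {}
    for row in rows:
        label = row[-1]       # the last column is the class label
        features = row[:-1]   # slice: every column except the last
        if label in result:
            result[label].append(features)
        else:
            result[label] = [features]  # start a new list of rows
    return result


# Hypothetical sample rows, already split into strings.
rows = [
    ['4.5', '3.5', 'U1'],
    ['4.5', '10.5', 'U2'],
    ['1', '2', '3', 'U1'],  # a longer row is handled the same way
]
grouped = group_by_last(rows)
print(grouped)
```

Note that the else branch wraps the slice in a list, so each class maps to a list of rows rather than to a single row.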
This perhaps?
data = """\
4.5,3.5,U1
4.5,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1""".splitlines()
from collections import defaultdict
dd = defaultdict(list)
for d in data:
    dl = d.split(',')
    dd[dl[-1]].append(list(map(float, dl[:-1])))
for key in dd:
    print key, dd[key]
prints:
U1 [[4.5, 3.5], [4.5, 6.0], [5.0, 7.0], [7.0, 6.5]]
U2 [[4.5, 10.5], [3.5, 10.5], [3.5, 10.5]]
Here is your code with a few small changes.
import csv
reader = csv.reader(open('test.data', 'r'))
result = {}
for row in reader:
    if not row:  # skip blank lines
        continue
    uclass = row[-1]  # the last column is the class label
    if uclass in result:
        result[uclass].append(row[:-1])  # every column except the last
    else:
        result[uclass] = [row[:-1]]  # start a new list of rows
print(repr(result))
I tested it on the following data, and it works.
5.66,4.5,3.5,U1
4.5,23123,34,10.5,U2
4.5,6,U1
3.5,10.5,U2
3.5,10.5,U2
5,7,U1
7,6.5,U1
4.5,45,73.3,56,66,72.5,U3
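One way to check rows of different lengths without a file on disk is to feed sample text to csv.reader through io.StringIO. The following is only a sketch for Python 3, using a few of the rows above plus a blank line. The variable names are illustrative, and dict.setdefault is swapped in for the if/else branch, so this is a variation and not the code above.

```python
import csv
import io

# A few of the sample rows, plus a blank line to exercise the empty-row check.
sample = """\
5.66,4.5,3.5,U1
4.5,23123,34,10.5,U2

4.5,6,U1
4.5,45,73.3,56,66,72.5,U3
"""

result = {}
for row in csv.reader(io.StringIO(sample)):
    if not row:  # csv.reader yields an empty list for a blank line
        continue
    # setdefault returns the existing list for this label, or stores a new one
    result.setdefault(row[-1], []).append(row[:-1])

for label, rows in result.items():
    print(label, rows)
```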
import csv
import collections

def main():
    with open('testdata.csv', 'rb') as inf:
        incsv = csv.reader(inf)
        res = collections.defaultdict(list)
        for row in incsv:
            key = row.pop()
            res[key].append([float(r) for r in row])
    for key,val in res.iteritems():
        print("{0}: {1}".format(key, val))

if __name__=="__main__":
    main()
results in
U1: [[4.5, 3.5], [4.5, 6.0], [5.0, 7.0], [7.0, 6.5]]
U2: [[4.5, 10.5], [3.5, 10.5], [3.5, 10.5]]
Comments:
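In Python 2, csv.reader expects a file opened in binary mode, so use 'rb' as the read mode. (In Python 3, open the file in text mode with newline='' instead.)
Slicing a list creates a new copy of the list; pop() removes the last item in place without copying, so it is slightly more efficient in time and memory.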
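For readers on Python 3, here is a hedged port of the same idea. It opens the file in text mode with newline='' and uses items() instead of iteritems(). The function name load_groups, the blank-line check, and the temporary demo file are assumptions added for illustration.

```python
import collections
import csv
import os
import tempfile


def load_groups(path):
    """Group the numeric columns of a CSV file by the label in the last column."""
    groups = collections.defaultdict(list)
    # Python 3: text mode with newline='', as the csv module documentation recommends.
    with open(path, newline='') as inf:
        for row in csv.reader(inf):
            if not row:      # skip blank lines (an added safeguard)
                continue
            key = row.pop()  # removes the label from row in place, no copy made
            groups[key].append([float(r) for r in row])
    return dict(groups)


# Demo on a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write("4.5,3.5,U1\n4.5,10.5,U2\n4.5,6,U1\n")
    demo_path = tmp.name

groups = load_groups(demo_path)
os.remove(demo_path)
for key, val in groups.items():
    print("{0}: {1}".format(key, val))
```

Because pop() mutates the row, the label is gone from row afterwards. That is fine here since each row is used only once.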