Reading file lines into an array (in a Pythonic way)

I'm reading lines from a file to then work with them. Each line is composed solely of float numbers.

I have pretty much everything sorted out up to converting the lines into arrays.

I basically do (pseudo-Python code):

 lines = f.readlines()
 for line in lines:
     values = line.split(' ')  # or whatever separator
     array = np.array(values)
     # and then iterate over every value, casting it to float
     for i, val in enumerate(array):
         newarray[i] = float(val)

This works, but it seems a bit counterintuitive and anti-Pythonic. I wanted to know if there is a better way to handle the input from a file so that I end up with an array full of floats.


Quick answer:

import numpy as np

arrays = []
for line in open(your_file):  # no need for readlines() if you don't want to store the lines
    # use a list comprehension to build your array on the fly
    new_array = np.array([float(i) for i in line.split(' ')])
    arrays.append(new_array)

If you process this kind of data often, the csv module will help.

import csv

arrays = []
# declare the format of your csv file and Python will turn each line into
# a list for you
parser = csv.reader(open(your_file), delimiter=' ')
for row in parser:
    arrays.append(np.array([float(i) for i in row]))

If you feel wild, you can even make this completely declarative:

import csv

parser = csv.reader(open(your_file), delimiter=' ')
make_array = lambda row: np.array([float(i) for i in row])
arrays = [make_array(row) for row in parser]

And if you really want your colleagues to hate you, you can make it a one-liner (NOT PYTHONIC AT ALL :-):

arrays = [np.array([float(i) for i in row]) for row in csv.reader(open(your_file), delimiter=' ')]

Stripping all the boilerplate and flexibility, you can end up with a clean and quite readable one-liner. I wouldn't use it because I like the refactoring potential of using csv, but it can be good enough. It's a grey zone, so I wouldn't say it's Pythonic, but it's definitely handy.

arrays = [np.array([float(i) for i in line.split()]) for line in open(your_file)]


If you want a numpy array and each row in the text file has the same number of values:

a = numpy.loadtxt('data.txt')
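
For reference, a minimal sketch of what that gives you (the file contents below are made up, fed in through a StringIO stand-in for data.txt):

import io
import numpy

# stand-in for data.txt; the numbers are made up
fake_file = io.StringIO("1.0 2.5 3.7\n4.2 5.1 6.9")
a = numpy.loadtxt(fake_file)
print(a.shape, a.dtype)  # (2, 3) float64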

Without numpy:

import csv

with open('data.txt') as f:
    arrays = list(csv.reader(f, delimiter=' ', quoting=csv.QUOTE_NONNUMERIC))
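
With quoting=csv.QUOTE_NONNUMERIC the reader converts every unquoted field to float for you, so each row comes back as a list of floats. A small self-contained sketch (the sample contents are made up):

import csv
import io

# stand-in for data.txt; the numbers are made up
fake_file = io.StringIO("1.0 2.5 3.7\n4.2 5.1 6.9")
rows = list(csv.reader(fake_file, delimiter=' ', quoting=csv.QUOTE_NONNUMERIC))
print(rows)  # [[1.0, 2.5, 3.7], [4.2, 5.1, 6.9]]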

Or just:

with open('data.txt') as f:
    arrays = [list(map(float, line.split())) for line in f]


How about the following:

import numpy as np

arrays = []
for line in open('data.txt'):
    arrays.append(np.array([float(val) for val in line.rstrip('\n').split(' ') if val != '']))


One possible one-liner:

a_list = [list(map(float, line.split(' '))) for line in a_file]

Note that I used map() here instead of a nested list comprehension to aid readability.
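
For comparison, the nested list comprehension it replaces would look roughly like this (using the same a_file handle as above):

# equivalent nested list comprehension; the double "for" is what map() avoids
a_list = [[float(x) for x in line.split(' ')] for line in a_file]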

If you want a numpy array:

an_array = np.array([list(map(float, line.split(' '))) for line in a_file])


I would use regular expressions

import re
import numpy as np

with open('data.txt') as f:
    all_lines = f.read()

# pull out every float-looking token (scientific notation included) and cast to float
new_array = np.array(re.findall(r'[\d.E+-]+', all_lines), dtype=float)

# m, n: the number of rows and columns you expect
new_array = np.reshape(new_array, (m, n))

First merge the file into one long string, then extract only the expressions corresponding to floats ('[\d.E+-]+' also covers scientific notation; '[\d.]+' matches plain decimal floats only).
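
A quick sketch of what the pattern picks up (the sample string is made up):

import re

sample = "1.5 2.0E-3 7\n-4.25 3.1 6.0E+2"
print(re.findall(r'[\d.E+-]+', sample))
# ['1.5', '2.0E-3', '7', '-4.25', '3.1', '6.0E+2']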
