How to remove 'None' from an Appended Multidimensional Array using numpy

Posted: 2023-01-10 18:32

I need to take a csv file and import this data into a multi-dimensional array in python, but I am not sure how to strip the 'None' values out of the array after I have appended my data to the empty array.

I first created a structure like this:

storecoeffs = numpy.empty((5,11), dtype='object')

This returns a 5-row by 11-column array populated with 'None'.

Next, I opened my csv file and converted it to an array:

coeffsarray = list(csv.reader(open("file.csv")))

coeffsarray = numpy.array(coeffsarray, dtype='object')

Then, I appended the two arrays:

newmatrix = numpy.append(storecoeffs, coeffsarray, axis=1)

The result is an array populated by 'None' values followed by the data that I want (first two rows shown to give you an idea as to the nature of my data):

array([[None, None, None, None, None, None, None, None, None, None, None,
        workers, constant, hhsize, inc1, inc2, inc3, inc4, age1, age2,
        age3, age4],
       [None, None, None, None, None, None, None, None, None, None, None,
        w0, 7.334, -1.406, 2.823, 2.025, 0.5145, 0, -4.936, -5.054, -2.8, 0],
       ...], dtype=object)

How do I remove those 'None' objects from each row so that I am left with the 5 x 11 multidimensional array containing my data?

@Gnibbler's answer is technically correct, but there's no reason to create the initial storecoeffs array in the first place. Just load in your values and then create an array from them. As @Mermoz noted, though, your use case looks simple enough for numpy.loadtxt().
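As a minimal sketch of that suggestion, the rows can be read with the csv module and turned into an array directly, with no pre-allocated array to append to. The in-memory sample below stands in for file.csv; its contents are an assumption based on the two rows shown in the question.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for file.csv, based on the rows shown in the question.
sample = io.StringIO(
    "workers,constant,hhsize,inc1,inc2,inc3,inc4,age1,age2,age3,age4\n"
    "w0,7.334,-1.406,2.823,2.025,0.5145,0,-4.936,-5.054,-2.8,0\n"
)

# Build the array straight from the parsed rows; no placeholder array needed.
rows = list(csv.reader(sample))
coeffsarray = np.array(rows, dtype=object)

print(coeffsarray.shape)         # (2, 11)
print(type(coeffsarray[1, 1]))   # <class 'str'>: values are still strings here
```

Because nothing is appended to a None-filled array, there is nothing to strip out afterwards.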

Beyond that, why are you using an object array? It's probably not what you want. Right now, you're storing the numerical values as strings, not floats!
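To illustrate the point about strings, here is a small sketch with made-up values shaped like the fields csv.reader yields. It only shows how an object array of strings behaves compared with a float array; the variable names are hypothetical.

```python
import numpy as np

# Fields as csv.reader yields them: every value is a str.
as_objects = np.array([["7.334", "-1.406", "2.823"]], dtype=object)
print(type(as_objects[0, 0]))   # <class 'str'>

# "Multiplying" a stored value repeats the string instead of doing arithmetic.
print(as_objects[0, 0] * 2)     # 7.3347.334

# Converting to a float array gives real numeric behaviour.
as_floats = as_objects.astype(float)
print(as_floats.dtype)          # float64
print(as_floats[0, 0] * 2)
```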

You have essentially two ways to handle your data in numpy. If you want easy access to named columns, use a structured array (or a record array). If you want to have a "normal" multidimensional array, just use an array of floats, ints, etc. Object arrays have a specific purpose, but it's probably not what you're doing.

For example: To just load in the data as a normal 2D numpy array (assuming all your data can be represented easily as a float):

import numpy as np
# Note that this ignores your column names, and attempts to 
# convert all values to a float...
data = np.loadtxt('input_filename.txt', delimiter=',', skiprows=1)

# Access the first column 
workers = data[:,0]

To load your data in as a structured array, you might do something like this:

import numpy as np
infile = open('input_filename.txt')

# Read in the names of the columns from the first row
# (the header is comma-separated, like the rest of the file)...
names = next(infile).strip().split(',')

# Make a dtype from these names...
dtype = {'names':names, 'formats':len(names)*[float]}

# Read the data in...
data = np.loadtxt(infile, dtype=dtype, delimiter=',')

# Note that data is now effectively 1-dimensional. To access a column,
# index it by name
workers = data['workers']

# Note that this is now one-dimensional... You can't treat it like a 2D array
# data[1:10, 3:5] # <-- Would raise an IndexError!

data[1:10][['inc1', 'inc2']] # <-- Effectively the same thing, but works..

If you have non-numerical values in your data and want to handle them as strings, you'll need to use a structured array, specify which fields you want to be strings, and set a max length for the strings in the field.

From your sample data, it looks like the first column, "workers" is a non-numerical value that you might want to store as a string and all the rest look like floats. In that case, you'd do something like this:

import numpy as np
infile = open('input_filename.txt')
names = next(infile).strip().split(',')

# Create the dtype... The 'U10' indicates a string field with a length of 10
# (on Python 3, 'S10' would give byte strings instead)
dtype = {'names':names, 'formats':['U10'] + (len(names) - 1)*[float]}
data = np.loadtxt(infile, dtype=dtype, delimiter=',')

# The "workers" field is now a string array
print(data['workers'])

# Compare this to the other fields
print(data['constant'])

If there are cases where you really need the flexibility of the csv module (e.g. text fields with commas), you can use it to read the data, and then convert it to a structured array with the appropriate dtype.
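A rough sketch of that csv-then-convert route follows. The sample text, the field contents, and the 'U20' width are all assumptions made for illustration; only the column names come from the question.

```python
import csv
import io

import numpy as np

# Hypothetical sample: the first field is quoted text containing a comma,
# which plain delimiter splitting would break apart.
sample = io.StringIO(
    'workers,constant,hhsize\n'
    '"w0, day shift",7.334,-1.406\n'
    '"w1, night shift",5.0,2.5\n'
)

reader = csv.reader(sample)
names = next(reader)  # header row

# First field is text (up to 20 characters); the rest are floats.
dtype = {'names': names, 'formats': ['U20'] + (len(names) - 1) * [float]}

# A structured array is built from a list of tuples, one tuple per row.
records = [(row[0], *(float(v) for v in row[1:])) for row in reader]
data = np.array(records, dtype=dtype)

print(data['workers'])
print(data['constant'])
```

The numeric fields are converted with float() explicitly so that a malformed value fails at the row where it occurs.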

Hope that makes things a bit clearer...

Start with an empty array?

storecoeffs = numpy.empty((5,0), dtype='object')

Why are you allocating an entire array of Nones and appending to that? Is coeffsarray not the array you want?
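A minimal sketch of both points, with a small made-up array standing in for the CSV data: appending to a zero-column array adds no None placeholders, and if the padded array already exists, slicing off its first 11 columns drops them.

```python
import numpy as np

# Made-up 5 x 11 stand-in for the data read from the CSV file.
coeffsarray = np.arange(55, dtype=float).reshape(5, 11).astype(object)

# Appending to a 5 x 0 array leaves nothing but the data.
storecoeffs = np.empty((5, 0), dtype=object)
newmatrix = np.append(storecoeffs, coeffsarray, axis=1)
print(newmatrix.shape)  # (5, 11)

# If the None-padded array from the question already exists,
# slice away its first 11 columns instead.
padded = np.append(np.empty((5, 11), dtype=object), coeffsarray, axis=1)
trimmed = padded[:, 11:]
print(trimmed.shape)    # (5, 11)
```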

Edit

Oh. Use numpy.reshape.

import numpy
coeffsarray = numpy.reshape( coeffsarray, ( 5, 11 ) )

Why not simply use numpy.loadtxt()?

newmatrix = numpy.loadtxt("file.csv", dtype='object', delimiter=',')

That should do the job, if I understood your question correctly.
