Vectorizing a for loop in NumPy/SciPy?
I'm trying to vectorize a for loop that I have inside of a class method. The for loop has the following form: it iterates through a bunch of points and depending on whether a certain variable (called "self.condition_met" below) is true, calls a pair of functions on the point, and adds the result to a list. Each point here is an element in a vector of lists, i.e. a data structure that looks like array([[1,2,3], [4,5,6], ...]). Here is the problematic function:
    class myClass:
        def my_inefficient_method(self):
            final_vector = []
            # Assume 'all_points', 'my_vector' and 'my_other_vector' are defined numpy arrays
            for point in all_points:
                if not self.condition_met:
                    a = self.my_func1(point, my_vector)
                    b = self.my_func2(point, my_other_vector)
                else:
                    a = self.my_func3(point, my_vector)
                    b = self.my_func4(point, my_other_vector)
                c = a + b
                final_vector.append(c)
            # Choose random element from resulting vector 'final_vector'
self.condition_met is set before my_inefficient_method is called, so it seems unnecessary to check it on every iteration, but I am not sure how to write this better. Since there are no destructive operations here, it seems like I could rewrite the entire thing as a vectorized operation. Is that possible? Any ideas how to do this?
This only takes a couple lines of code in NumPy (the rest is just creating a data set, a couple of functions, and set-up).
    import numpy as NP

    # create two functions
    fnx1 = lambda x: x**2
    fnx2 = lambda x: NP.sum(fnx1(x))

    # create some data: 8 data points (rows) with 5 values each
    M = NP.random.randint(10, 99, 40).reshape(8, 5)

    # create an index array based on condition satisfaction
    # (is the sum of that row/data point even or odd?)
    ndx = NP.where(NP.sum(M, axis=1) % 2 == 0)[0]

    # only those data points that satisfy the condition (even row sum)
    # are passed to one function, then another, and the result of applying
    # both functions to each data point is stored in an array
    res = NP.apply_along_axis(fnx2, 1, M[ndx])
    print(res)
    # prints a 1-D array with one value per selected row
    # (the values vary between runs because the data is random)
From your description I abstracted this flow:
- check for condition (boolean) 'if True'
- call the pair of functions on those data points (rows) that satisfy the condition
- append the result from each set of calls to a list ('res' above)
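The flow listed above can also be sketched with a boolean mask instead of an index array. This is only an illustration under assumed data: the array `M` and the "even row sum" condition are stand-ins, not the asker's real points or condition.

```python
import numpy as np

# Assumed example data: 8 data points (rows) with 5 values each
M = np.arange(40).reshape(8, 5)

# Step 1: check the condition once per row (here: is the row sum even?)
mask = M.sum(axis=1) % 2 == 0

# Steps 2 and 3: square the selected rows, then sum each row,
# collecting one result per selected data point
res = (M[mask] ** 2).sum(axis=1)

# Reference loop doing the same thing one point at a time
looped = [int(np.sum(row ** 2)) for row in M if row.sum() % 2 == 0]
```

Both versions produce one value per selected row; the masked version avoids the Python-level loop.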
Can you rewrite my_funcx to be vectorized? If so, you can do
    class myClass:
        def my_efficient_method(self):
            # Assume 'all_points', 'my_vector' and 'my_other_vector' are defined numpy arrays
            if not self.condition_met:
                a = self.my_func1(all_points, my_vector)
                b = self.my_func2(all_points, my_other_vector)
            else:
                a = self.my_func3(all_points, my_vector)
                b = self.my_func4(all_points, my_other_vector)
            final_vector = a + b
            # Choose random element from resulting vector 'final_vector'
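What "vectorized" means for the my_func functions depends on what they compute, which the question does not say. As a rough sketch, assume two hypothetical functions: `func1` takes the dot product of each point with a vector, and `func2` takes the squared distance of each point from a vector. Written with array operations, the same functions accept either a single point or the whole `all_points` array:

```python
import numpy as np

def func1(points, vec):
    # Hypothetical stand-in for my_func1: dot product of each point with vec
    return points @ vec

def func2(points, vec):
    # Hypothetical stand-in for my_func2: squared distance of each point from vec
    return ((points - vec) ** 2).sum(axis=-1)

# Assumed example data
all_points = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
my_vector = np.array([1, 0, -1])
my_other_vector = np.array([1, 1, 1])

# One call per function handles every point at once
final_vector = func1(all_points, my_vector) + func2(all_points, my_other_vector)

# The original per-point loop gives the same values
looped = np.array([func1(p, my_vector) + func2(p, my_other_vector)
                   for p in all_points])

# Choose a random element from the resulting vector
chosen = np.random.default_rng().choice(final_vector)
```

The key is that each function only uses operations that broadcast over the leading axis, so no explicit loop over points is needed.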
It's probably best to do what mtrw suggests, but if you're not sure about vectorizing, you can try numpy.vectorize on the my_funcs:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html
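A minimal sketch of that approach, assuming a hypothetical `my_func` that takes one point and one vector and returns a scalar. The `signature` argument tells `numpy.vectorize` that each call consumes a whole row rather than a single element. Note that the NumPy documentation describes `vectorize` as a convenience that is essentially a for loop, so it tidies the calling code more than it speeds it up.

```python
import numpy as np

def my_func(point, vec):
    # Hypothetical per-point function: dot product of one point with vec
    return float(np.dot(point, vec))

# "(n),(n)->()" means: two length-n inputs produce one scalar output
vfunc = np.vectorize(my_func, signature="(n),(n)->()")

# Assumed example data
all_points = np.array([[1, 2, 3], [4, 5, 6]])
my_vector = np.array([1, 0, -1])

# my_func is called once per row of all_points
result = vfunc(all_points, my_vector)
```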