Numpy Lookup (Map, or Point)

I have a large numpy array:

array([[32, 32, 99,  9, 45],  # A
       [99, 45,  9, 45, 32],
       [45, 45, 99, 99, 32],
       [ 9,  9, 32, 45, 99]])

and a large-ish array of unique values in a particular order:

array([ 99, 32, 45, 9])       # B

How can I quickly (no Python dictionaries, no copies of A, no Python loops) replace the values in A so that they become the indices of those values in B? The desired result is:

array([[1, 1, 0, 3, 2],
       [0, 2, 3, 2, 1],
       [2, 2, 0, 0, 1],
       [3, 3, 1, 2, 0]])

I feel really dumb for not being able to do this off the top of my head, nor find it in the documentation. Easy points!


Here you go

import numpy as np

A = np.array([[32, 32, 99,  9, 45],  # A
              [99, 45,  9, 45, 32],
              [45, 45, 99, 99, 32],
              [ 9,  9, 32, 45, 99]])

B = np.array([99, 32, 45, 9])

ii = np.argsort(B)
C = np.digitize(A.reshape(-1,),np.sort(B)) - 1

Originally I suggested:

D = np.choose(C,ii).reshape(A.shape)

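But I realized that np.choose has limitations with larger arrays: it accepts only a limited number of choices (32 in NumPy 1.x), so it fails once B has more unique values than that. Instead, borrowing from @unutbu's clever reply: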

D = np.argsort(B)[C].reshape(A.shape)

Or the one-liner

np.argsort(B)[np.digitize(A.reshape(-1,),np.sort(B)) - 1].reshape(A.shape)

I found this to be faster or slower than @unutbu's code depending on the size of the arrays under consideration and the number of unique values.
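As a quick sanity check, here is a minimal sketch that runs the digitize-based lookup on the example arrays and compares it with the result requested in the question. The name `expected` is mine and is not part of the original answer.

```python
import numpy as np

A = np.array([[32, 32, 99,  9, 45],
              [99, 45,  9, 45, 32],
              [45, 45, 99, 99, 32],
              [ 9,  9, 32, 45, 99]])
B = np.array([99, 32, 45, 9])

# Position of each element of A within the sorted copy of B
C = np.digitize(A.reshape(-1), np.sort(B)) - 1
# Translate positions in sorted(B) back to positions in the original B
D = np.argsort(B)[C].reshape(A.shape)

# Result requested in the question (copied from there)
expected = np.array([[1, 1, 0, 3, 2],
                     [0, 2, 3, 2, 1],
                     [2, 2, 0, 0, 1],
                     [3, 3, 1, 2, 0]])
print(np.array_equal(D, expected))
```

Indexing B with D should also give back A, which is a handy check when the expected output is not known in advance.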


import numpy as np
A=np.array([[32, 32, 99,  9, 45],  
            [99, 45,  9, 45, 32],
            [45, 45, 99, 99, 32],
            [ 9,  9, 32, 45, 99]])

B=np.array([ 99, 32, 45, 9])

cutoffs=np.sort(B)
print(cutoffs)
# [ 9 32 45 99]

index=cutoffs.searchsorted(A)
print(index)
# [[1 1 3 0 2]
#  [3 2 0 2 1]
#  [2 2 3 3 1]
#  [0 0 1 2 3]]    

index holds the indices into the array cutoffs associated with each element of A. Note that we had to sort B, since np.searchsorted expects a sorted array.
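One caveat, in case A might contain values that are absent from B (the question implies it does not): searchsorted returns insertion points rather than matches, so a missing value still gets an index. The sketch below shows one possible way to detect that. The `probe` values and the names `pos`, `safe` and `found` are hypothetical and only for illustration.

```python
import numpy as np

cutoffs = np.array([9, 32, 45, 99])   # sorted copy of B
probe = np.array([32, 40, 100])       # 40 and 100 do not occur in B

# Insertion points; a value larger than every cutoff maps to len(cutoffs)
pos = cutoffs.searchsorted(probe)

# Clip so that an insertion point past the end cannot raise an IndexError
safe = np.clip(pos, 0, len(cutoffs) - 1)

# True only where the value at the insertion point equals the probe value
found = cutoffs[safe] == probe

print(pos)
print(found)
```

If `found.all()` is False, the lookup result for the unmatched elements should not be trusted.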

index is almost the desired answer, except that we want to map

1-->1
3-->0
0-->3
2-->2

np.argsort provides us with this mapping:

print(np.argsort(B))
# [3 1 2 0]
print(np.argsort(B)[1])
# 1
print(np.argsort(B)[3])
# 0
print(np.argsort(B)[0])
# 3
print(np.argsort(B)[2])
# 2

print(np.argsort(B)[index])
# [[1 1 0 3 2]
#  [0 2 3 2 1]
#  [2 2 0 0 1]
#  [3 3 1 2 0]]

So, as a one-liner, the answer is:

np.argsort(B)[np.sort(B).searchsorted(A)]

Calling both np.sort(B) and np.argsort(B) is inefficient since both operations amount to sorting B. For any 1D-array B,

np.sort(B) == B[np.argsort(B)]

So we can compute the desired result a bit faster using

key=np.argsort(B)
result=key[B[key].searchsorted(A)]
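For reuse, the single-sort version can be wrapped in a small function. This is only a sketch: the function name `lookup_indices` is my own, and it assumes B is one-dimensional with unique values and that every element of A occurs in B.

```python
import numpy as np

def lookup_indices(A, B):
    """Return an array shaped like A holding each element's index in B.

    Assumes B is 1-D with unique values and every element of A occurs in B.
    """
    key = np.argsort(B)              # sort B once
    pos = B[key].searchsorted(A)     # positions in the sorted copy of B
    return key[pos]                  # translate back to positions in B

A = np.array([[32, 32, 99,  9, 45],
              [99, 45,  9, 45, 32],
              [45, 45, 99, 99, 32],
              [ 9,  9, 32, 45, 99]])
B = np.array([99, 32, 45, 9])

result = lookup_indices(A, B)
print(result)

# Round trip: indexing B with the result should reproduce A
print(np.array_equal(B[result], A))
```

np.searchsorted also accepts a `sorter` argument, so `key[np.searchsorted(B, A, sorter=key)]` should give the same result without building the sorted copy `B[key]`. I have not timed the two variants against each other.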