undo or reverse argsort(), python
Given an array 'a' I would like to sort the array by columns sort(a, axis=0)
do some stuff to the array and then undo the sort. By that I don't mean re sort but basically reversing how each element was moved. I assume argsort()
is what I need but it is not clear to me how to sort an array with the results of argsort()
or more importantly apply the reverse/inverse of argsort()
Here is a little more detail
I have an array a
, shape开发者_C百科(a) = rXc
I need to sort each column
aargsort = a.argsort(axis=0) # May use this later
aSort = a.sort(axis=0)
now average each row
aSortRM = asort.mean(axis=1)
now replace each col in a row with the row mean. is there a better way than this
aWithMeans = ones_like(a)
for ind in range(r) # r = number of rows
aWithMeans[ind]* aSortRM[ind]
Now I need to undo the sort I did in the first step. ????
There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:
>>> import numpy as np
>>> a = np.random.randint(0,10,10)
>>> aa = np.argsort(a)
>>> aaa = np.argsort(aa)
>>> a # original
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
>>> a[aa] # sorted
array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7])
>>> a[aa][aaa] # undone
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
For all those still looking for an answer:
In [135]: r = rand(10)
In [136]: i = argsort(r)
In [137]: r_sorted = r[i]
In [138]: i_rev = zeros(10, dtype=int)
In [139]: i_rev[i] = arange(10)
In [140]: allclose(r, r_sorted[i_rev])
Out[140]: True
I'm not sure how best to do it in numpy
, but, in pure Python, the reasoning would be:
aargsort
is holding a permutation of range(len(a))
telling you where the items of aSort
came from -- much like, in pure Python:
>>> x = list('ciaobelu')
>>> r = range(len(x))
>>> r.sort(key=x.__getitem__)
>>> r
[2, 4, 0, 5, 1, 6, 3, 7]
>>>
i.e., the first argument of sorted(x)
will be x[2]
, the second one x[4]
, and so forth.
So given the sorted version, you can reconstruct the original by "putting items back where they came from":
>>> s = sorted(x)
>>> s
['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u']
>>> original = [None] * len(s)
>>> for i, c in zip(r, s): original[i] = c
...
>>> original
['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u']
>>>
Of course there are going to be tighter and faster ways to express this in numpy
(which unfortunately I don't know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the "putting things back in place" operation you need to perform.
Super late to the game, but here:
import numpy as np
N = 1000 # or any large integer
x = np.random.randn( N )
I = np.argsort( x )
J = np.argsort( I )
print( np.allclose( x[I[J]] , x ) )
>> True
Basically, argsort the argsort because the nth element of the reverse sort is J[n] = k : I[k] = n. That is, I[J[n]] = n, so J sorts I.
indices=np.argsort(a)
gives you the sorting indices, such that x = a[indices]
is the sorted array. y = b[indices]
pushs forward the array b
to sorted domain. c[indices] = z
pulls back z
from sorted domain into c
in source domain.
For instance,
import numpy as np
n = 3
a = np.random.randint(0,10,n) # [1, 5, 2]
b = np.random.randn(n) # [-.1, .5, .2]
c = np.empty_like(a)
# indices that sort a: x=a[indices], x==np.sort(a) is all True
indices = np.argsort(a) # [0,2,1]
# y[i] is the value in b at the index of the i-th smallest value in a
y = b[indices] # [-.1, .2, .5]
# say z[i] is some value related to the i-th smallest entry in a
z = np.random.randn(n) # [-1.1, 1.2, 1.3]
c[indices] = z # inverted the sorting map, c = [-1.1, 1.3, 1.2]
I was not able to follow your example, but the more abstract problem--i.e., how to sort an array then reverse the sort--is straightforward.
import numpy as NP
# create an 10x6 array to work with
A = NP.random.randint(10, 99, 60).reshape(10, 6)
# for example, sort this array on the second-to-last column,
# breaking ties using the second column (numpy requires keys in
# "reverse" order for some reason)
keys = (A[:,1], A[:,4])
ndx = NP.lexsort(keys, axis=0)
A_sorted = NP.take(A, ndx, axis=0)
To "reconstruct" A from A_sorted is trivial because remember that you used an index array ('ndx') to sort the array in the first place.
# ndx array for example above: array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5])
In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.
精彩评论