Shuffling NumPy array along a given axis
Given the following NumPy array,
> a = array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5],[1, 2, 3, 4, 5]])
it's simple enough to shuffle a single row,
> shuffle(a[0])
> a
array([[4, 2, 1, 3, 5],[1, 2, 3, 4, 5],[1, 2, 3, 4, 5]])
Is it possible to use indexing notation to shuffle each of the rows independently, or do you have to iterate over the array? I had in mind something like,
> numpy.shuffle(a[:])
> a
array([[4, 2, 3, 5, 1],[3, 1, 4, 5, 2],[4, 2, 1, 3, 5]]) # Not the real output
though this clearly doesn't work.
Vectorized solution with rand+argsort trick
We could generate unique indices along the specified axis and index into the input array with advanced indexing. To generate the unique indices, we would use the random float generation + sort trick, thus giving us a vectorized solution. We would also generalize it to cover generic n-dim arrays and generic axes with np.take_along_axis. The final implementation would look something like this -
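import numpy as np

def shuffle_along_axis(a, axis):
    # Argsorting random floats along `axis` gives an independent permutation per slice
    idx = np.random.rand(*a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)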
Note that this shuffle won't be in-place and returns a shuffled copy.
Sample run -
In [33]: a
Out[33]:
array([[18, 95, 45, 33],
       [40, 78, 31, 52],
       [75, 49, 42, 94]])
In [34]: shuffle_along_axis(a, axis=0)
Out[34]:
array([[75, 78, 42, 94],
       [40, 49, 45, 52],
       [18, 95, 31, 33]])
In [35]: shuffle_along_axis(a, axis=1)
Out[35]:
array([[45, 18, 33, 95],
       [31, 78, 52, 40],
       [42, 75, 94, 49]])
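To make runs like the one above repeatable, the same trick can be written against a seeded generator. This is only a sketch: the name shuffle_along_axis_seeded and its seed parameter are made up for illustration and are not part of the answer's function. It also checks that every row of the result still holds the same values as the matching input row.

```python
import numpy as np

def shuffle_along_axis_seeded(a, axis, seed=None):
    # Hypothetical variant of shuffle_along_axis: a seed makes the result repeatable.
    rng = np.random.default_rng(seed)
    # Argsort of random floats gives an independent permutation per 1-D slice.
    idx = rng.random(a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

a = np.array([[18, 95, 45, 33],
              [40, 78, 31, 52],
              [75, 49, 42, 94]])
out = shuffle_along_axis_seeded(a, axis=1, seed=0)

# Each row of the result contains the same values as the matching input row.
same_values = np.array_equal(np.sort(out, axis=1), np.sort(a, axis=1))
# Calling again with the same seed reproduces the same shuffle.
repeatable = np.array_equal(out, shuffle_along_axis_seeded(a, axis=1, seed=0))
print(same_values, repeatable)
```

As with the original function, this returns a shuffled copy and leaves a untouched.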
You have to call numpy.random.shuffle() several times because you are shuffling several sequences independently. numpy.random.shuffle() works on any mutable sequence and is not actually a ufunc. The shortest and most efficient code to shuffle all rows of a two-dimensional array a separately probably is
list(map(numpy.random.shuffle, a))
Some people prefer to write this as a list comprehension instead:
[numpy.random.shuffle(x) for x in a]
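Both one-liners build a throwaway list of None values, since numpy.random.shuffle() works in place and returns nothing. A plain loop does the same job without that list. The sketch below uses the array from the question and relies on the fact that iterating over a 2-D array yields views of its rows.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5]])

# Each `row` is a view into `a`, so shuffling it modifies `a` itself.
for row in a:
    np.random.shuffle(row)

# Every row is still some ordering of 1..5.
rows_intact = np.array_equal(np.sort(a, axis=1),
                             np.tile(np.arange(1, 6), (3, 1)))
print(rows_intact)
```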
For those looking at this question more recently, NumPy provides the permuted method to shuffle an array independently along the specified axis. From their documentation (using random.Generator):
rng = np.random.default_rng()
x = np.arange(24).reshape(3, 8)
x
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23]])
y = rng.permuted(x, axis=1)
y
array([[ 4,  3,  6,  7,  1,  2,  5,  0],
       [15, 10, 14,  9, 12, 11,  8, 13],
       [17, 16, 20, 21, 18, 22, 23, 19]])
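It may help to contrast permuted with the generator's shuffle method. As a rough sketch (the seed value and the helper names rows_keep_values and columns_intact are chosen only for this illustration): permuted reorders each row on its own and can write back into the input through out=, while shuffle with axis=1 moves whole columns, so every row gets the same reordering.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for repeatability
x = np.arange(24).reshape(3, 8)

# permuted: each row gets its own permutation; out=x writes the result into x.
rng.permuted(x, axis=1, out=x)
rows_keep_values = np.array_equal(np.sort(x, axis=1),
                                  np.arange(24).reshape(3, 8))

# shuffle: whole columns move together, so all rows share one reordering.
z = np.arange(24).reshape(3, 8)
rng.shuffle(z, axis=1)
# Within every column the rows still differ by exactly 8.
columns_intact = np.array_equal(z[1] - z[0], np.full(8, 8))
print(rows_keep_values, columns_intact)
```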