How to account for column-contiguous array when extending numpy with C

2023-01-30 07:27 问答作者：

I have a C-function to normalize the rows of an array in log-space (this prevents numerical underflow).

The prototype of my C-function is as follows:

void normalize_logspace_matrix(size_t nrow, size_t ncol, double* mat);

You can see that it takes a pointer to an array and modifies it in place. The C-code of course assumes the data is saved as a C-contiguous array, i.e. row-contiguous.

I wrap the function as follows using Cython (imports and cdef extern from omitted):

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d
    n = mat.shape[0]
    d = mat.shape[1]
    normalize_logspace_matrix(n, d, <double*> mat.data)
    return mat

Most of the time numpy-arrays are row-contiguous and the function works fine. However, if a numpy-array has been previously transposed the data is not copied around but just a new view into the data is returned. In this case my function fails because the array is no longer row-contiguous.

I can get around this by defining the array to have Fortran-contiguous order, such that after transposing it will be C-contiguous:

A = np.array([some_func(d) for d in range(D)], order='F').T
A = normalize_logspace(A)

Obviously that's very error-prone and the user has to take care that the array is in the correct order which is something that the user shouldn't need to care about in Python.

What's the best way how I can make this work with both row- and column-contiguous arr开发者_C百科ays? I assume that some kind of array-order checking in Cython is the way to go. Of course, I'd prefer a solution that doesn't require to copy the data into a new array, but I almost assume that's necessary.

If you want to support arrays in C and Fortran order without ever copying, your C function needs to be flexible enough to support both orderings. This can be achieved by passing the strides of the NumPy array to the C function: Change the prototype to

void normalize_logspace_matrix(size_t nrow, size_t ncol, 
                               size_t rowstride, size_t colstride,
                               double* mat);

and the Cython call to

def normalize_logspace(np.ndarray[np.double_t, ndim=2] mat):
    cdef Py_ssize_t n, d, rowstride, colstride
    n = mat.shape[0]
    d = mat.shape[1]
    rowstride = mat.strides[0] // mat.itemsize
    colstride = mat.strides[1] // mat.itemsize
    normalize_logspace_matrix(n, d, rowstride, colstride, <double*> mat.data)
    return mat

Then, replace every occurence of mat[row*ncol + col] in your C code by mat[row*rowstride + col*colstride].

In this case you really do want to create a copy of the input array (which can be a view on a real array) with guaranteed row-contiguous order. You can achieve this with something like this:

a = numpy.array(A, copy=True, order='C')

Also, consider taking a look at the exact array interface of Numpy (there is C part too).

+1 to Sven, whose answer solves the gotcha (well, got me) that dstack returns an F_contiguous array ?!

# don't use dstack to stack a,a,a -> rgb for a C func

import sys
import numpy as np

h = 2
w = 4
dim = 3
exec( "\n".join( sys.argv[1:] ))  # run this.py h= ...

a = np.arange( h*w, dtype=np.uint8 ) .reshape((h,w))
rgb = np.empty( (h,w,dim), dtype=np.uint8 )
rgb[:,:,0] = rgb[:,:,1] = rgb[:,:,2] = a
print "rgb:", rgb
print "rgb.flags:", rgb.flags  # C_contiguous
print "rgb.strides:", rgb.strides  # (12, 3, 1)

dstack = np.dstack(( a, a, a ))
print "dstack:", dstack
print "dstack.flags:", dstack.flags  # F_contiguous
print "dstack.strides:", dstack.strides  # (1, 2, 8)

继续阅读：c cython numpy python

How to account for column-contiguous array when extending numpy with C

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？