Extending Numpy with C function

2023-01-27 12:24 问答作者：

I am trying to speed up my Numpy code and decided that I wanted to implement one particular function where my code spent most of the time in C.

I'm actually a rookie in C, but I managed to write the function which normalizes every row in a matrix to sum to 1. I can compile it and I tested it with some data (in C) and it does what I want. At that point I was very proud of myself.

Now I'm trying to call my glorious function from Python where it should accept a 2d-Numpy array.

The various things I've tried are

SWIG
SWIG + numpy.i
ctypes

My function has the prototype

void normalize_logspace_matrix(size_t nrow, size_t ncol, double mat[nrow][ncol]);

So it takes a pointer to a variable-length array and modifies it in place.

I tried the following pure SWIG interface file:

%module c_utils

%{
extern void normalize_logspace_matrix(size_t, size_t, double mat[*][*]);
%}

extern void normalize_logspace_matrix(size_t, size_t, double** mat);

Then I would do (on Mac OS X 64bit):

> swig -python c-utils.i

> gcc -fPIC c-utils_wrap.c -o c-utils_wrap.o \
     -I/Library/Frameworks/Python.framework/Versions/6.2/include/python2.6/ \
     -L/Library/Frameworks/Python.framework/Versions/6.2/lib/python2.6/ -c
c-utils_wrap.c: In function ‘_wrap_normalize_logspace_matrix’:
c-utils_wrap.c:2867: warning: passing argument 3 of ‘normalize_logspace_matrix’ from   incompatible pointer type

> g++ -dynamiclib c-utils.o -o _c_utils.so

In Python I then get the following error on importing my module:

>>> import c_utils
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (initc_utils)

Next I tried this approach using SWIG + numpy.i:

%module c_utils

%{
#define SWIG_FILE_WITH_INIT
#include "c-utils.h"
%}
%include "numpy.i"
%init %{
import_array();
%}

%apply ( int DIM1, int DIM2, DATA_TYPE* INPLACE_ARRAY2 ) 
       {(size_t nrow, size_t ncol, double* mat)};

%include "c-utils.h"

However, I don't get any further than this:

> swig -python c-utils.i
c-utils.i:13: Warning 453: Can't apply (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2). No typemaps are defined.

SWIG doesn't seem to find the 开发者_C百科typemaps defined in numpy.i, but I don't understand why, because numpy.i is in the same directory and SWIG doesn't complain that it can't find it.

With ctypes I didn't get very far, but got lost in the docs pretty quickly since I couldn't figure out how to pass it a 2d-array and then get the result back.

So could somebody show me the magic trick how to make my function available in Python/Numpy?

Unless you have a really good reason not to, you should use cython to interface C and python. (We are starting to use cython instead of raw C inside numpy/scipy themselves).

You can see a simple example in my scikits talkbox (since cython has improved quite a bit since then, I think you could write it better today).

def cslfilter(c_np.ndarray b, c_np.ndarray a, c_np.ndarray x):
    """Fast version of slfilter for a set of frames and filter coefficients.
    More precisely, given rank 2 arrays for coefficients and input, this
    computes:

    for i in range(x.shape[0]):
        y[i] = lfilter(b[i], a[i], x[i])

    This is mostly useful for processing on a set of windows with variable
    filters, e.g. to compute LPC residual from a signal chopped into a set of
    windows.

    Parameters
    ----------
        b: array
            recursive coefficients
        a: array
            non-recursive coefficients
        x: array
            signal to filter

    Note
    ----

    This is a specialized function, and does not handle other types than
    double, nor initial conditions."""

    cdef int na, nb, nfr, i, nx
    cdef double *raw_x, *raw_a, *raw_b, *raw_y
    cdef c_np.ndarray[double, ndim=2] tb
    cdef c_np.ndarray[double, ndim=2] ta
    cdef c_np.ndarray[double, ndim=2] tx
    cdef c_np.ndarray[double, ndim=2] ty

    dt = np.common_type(a, b, x)

    if not dt == np.float64:
        raise ValueError("Only float64 supported for now")

    if not x.ndim == 2:
        raise ValueError("Only input of rank 2 support")

    if not b.ndim == 2:
        raise ValueError("Only b of rank 2 support")

    if not a.ndim == 2:
        raise ValueError("Only a of rank 2 support")

    nfr = a.shape[0]
    if not nfr == b.shape[0]:
        raise ValueError("Number of filters should be the same")

    if not nfr == x.shape[0]:
        raise ValueError, \
              "Number of filters and number of frames should be the same"

    tx = np.ascontiguousarray(x, dtype=dt)
    ty = np.ones((x.shape[0], x.shape[1]), dt)

    na = a.shape[1]
    nb = b.shape[1]
    nx = x.shape[1]

    ta = np.ascontiguousarray(np.copy(a), dtype=dt)
    tb = np.ascontiguousarray(np.copy(b), dtype=dt)

    raw_x = <double*>tx.data
    raw_b = <double*>tb.data
    raw_a = <double*>ta.data
    raw_y = <double*>ty.data

    for i in range(nfr):
        filter_double(raw_b, nb, raw_a, na, raw_x, nx, raw_y)
        raw_b += nb
        raw_a += na
        raw_x += nx
        raw_y += nx

    return ty

As you can see, besides the usual argument checking you would do in python, it is almost the same thing (filter_double is a function which can be written in pure C in a separate library if you want to). Of course, since it is compiled code, failing to check your argument will crash your interpreter instead of raising exception (there are several levels of safety vs speed tradeoffs available with recent cython, though).

To answer the real question: SWIG doesn't tell you it can't find any typemaps. It tells you it can't apply the typemap (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2), which is because there is no typemap defined for DATA_TYPE *. You need to tell it you want to apply it to a double*:

%apply ( int DIM1, int DIM2, double* INPLACE_ARRAY2 ) 
       {(size_t nrow, size_t ncol, double* mat)};

First, are you sure that you were writing the fastest possible numpy code? If by normalise you mean divide the whole row by its sum, then you can write fast vectorised code which looks something like this:

matrix /= matrix.sum(axis=0)

If this is not what you had in mind and you are still sure that you need a fast C extension, I would strongly recommend you write it in cython instead of C. This will save you all the overhead and difficulties in wrapping code, and allow you to write something which looks like python code but which can be made to run as fast as C in most circumstances.

I agree with others that a little Cython is well worth learning. But if you must write C or C++, use a 1d array which overlays the 2d, like this:

// sum1rows.cpp: 2d A as 1d A1
// Unfortunately
//     void f( int m, int n, double a[m][n] ) { ... }
// is valid c but not c++ .
// See also
// http://stackoverflow.com/questions/3959457/high-performance-c-multi-dimensional-arrays
// http://stackoverflow.com/questions/tagged/multidimensional-array c++

#include <stdio.h>

void sum1( int n, double x[] )  // x /= sum(x)
{
    float sum = 0;
    for( int j = 0; j < n; j ++  )
        sum += x[j];
    for( int j = 0; j < n; j ++  )
        x[j] /= sum;
}

void sum1rows( int nrow, int ncol, double A1[] )  // 1d A1 == 2d A[nrow][ncol]
{
    for( int j = 0; j < nrow*ncol; j += ncol  )
        sum1( ncol, &A1[j] );
}

int main( int argc, char** argv )
{
    int nrow = 100, ncol = 10;
    double A[nrow][ncol];
    for( int j = 0; j < nrow; j ++ )
    for( int k = 0; k < ncol; k ++ )
        A[j][k] = (j+1) * k;

    double* A1 = &A[0][0];  // A as 1d array -- bad practice
    sum1rows( nrow, ncol, A1 );

    for( int j = 0; j < 2; j ++ ){
        for( int k = 0; k < ncol; k ++ ){
            printf( "%.2g ", A[j][k] );
        }
        printf( "\n" );
    }
}

Added 8 Nov: as you probably know, numpy.reshape can overlay a numpy 2d array with a 1d view to pass to sum1rows, like this:

import numpy as np
A = np.arange(10).reshape((2,5))
A1 = A.reshape(A.size)  # a 1d view of A, not a copy
# sum1rows( 2, 5, A1 )
A[1,1] += 10
print "A:", A
print "A1:", A1

SciPy has an extension tutorial with example code for arrays. http://docs.scipy.org/doc/numpy/user/c-info.how-to-extend.html

继续阅读：c cython numpy python swig

Extending Numpy with C function

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？