开发者

1D Fast Convolution without FFT

I need an 1D Convolution against 2 big arrays. I'm using this code in C# but it takes a loooong time to run.

I know, i know! FFT convolutions is very fast. But in this project i CAN'T use it. It is a constraint of the project to not use FFT (please don't ask why :/).

This is my code in C# (ported fro开发者_如何学编程m matlab, by the way):

var result = new double[input.Length + filter.Length - 1];
for (var i = 0; i < input.Length; i++)
{
    for (var j = 0; j < filter.Length; j++)
    {
        result[i + j] += input[i] * filter[j];
    }
}

So, anyone knows any fast convolution algorithm widthout FFT?


Convolution is numerically the same as a polynomial multiplication with an extra wrap-around step. Therefore, all the polynomial and large integer multiplication algorithms can be used to perform convolution.

FFT is the only way to get the fast O(n log(n)) run-time. But you can still get sub-quadratic run-time using the divide-and-conquer approaches like Karatsuba's algorithm.

Karatsuba's algorithm is fairly easy to implement once you understand how it works. It runs in O(n^1.585), and will probably be faster than trying to super-optimize the classic O(n^2) approach.


You could reduce the number of indexed accesses to result, as well as the Length properties:

int inputLength = filter.Length;
int filterLength = filter.Length;
var result = new double[inputLength + filterLength - 1];
for (int i = resultLength; i >= 0; i--)
{
    double sum = 0;
    // max(i - input.Length + 1,0)
    int n1 = i < inputLength ? 0 : i - inputLength + 1;
    // min(i, filter.Length - 1)
    int n2 = i < filterLength ? i : filterLength - 1;
    for (int j = n1; j <= n2; j++)
    {
        sum += input[i - j] * filter[j];
    }
    result[i] = sum;
}

If you further split the outer loop, you can get rid of some repeating conditionals. (This assumes 0 < filterLengthinputLengthresultLength)

int inputLength = filter.Length;
int filterLength = filter.Length;
int resultLength = inputLength + filterLength - 1;

var result = new double[resultLength];

for (int i = 0; i < filterLength; i++)
{
    double sum = 0;
    for (int j = i; j >= 0; j--)
    {
        sum += input[i - j] * filter[j];
    }
    result[i] = sum;
}
for (int i = filterLength; i < inputLength; i++)
{
    double sum = 0;
    for (int j = filterLength - 1; j >= 0; j--)
    {
        sum += input[i - j] * filter[j];
    }
    result[i] = sum;
}
for (int i = inputLength; i < resultLength; i++)
{
    double sum = 0;
    for (int j = i - inputLength + 1; j < filterLength; j++)
    {
        sum += input[i - j] * filter[j];
    }
    result[i] = sum;
}


You can use a special IIR filter. Then process that as like:

y(n)= a1*y(n-1)+b1*y(n-2)...+a2*x(n-1)+b2*x(n-2)......

I think it's faster.


Here are two possibilities that may give minor speedups, but you'd need to test to be sure.

  1. Unroll the inner loop to remove some tests. This will be easier if you know the filter length will always be a multiple if some N.
  2. Reverse the order of the loops. Do filter.length passes over the whole array. This does less dereferencing in the inner loop but may have worse caching behavior.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜