开发者

Composing an average stream piecewise

I have a list of n floating point streams each having a different size.

The streams can be be composed together using the following rules:

You can put a stream starting at any point in time (its zero before it started). You can use the same stream few times (it can overlap itself and even be in the same position few times) and you are allowed to not use a certain stream at all.

e.g.:

input streams:

1 2 3 4
2 4 5 6 7
1 5 6

Can be composed like:

  1 2 3 4
1 5 6
        1 5 6

After the placements an output stream is composed by the rule that each 开发者_JAVA技巧output float equals to the square root of the sum of the square of each term.

e.g.:

If the streams at a position are:

1
2
3

The output is:

sqrt(1*1 + 2*2 + 3*3) = sqrt(14) = 3.74...

So for the the example composition:

  1 2 3 4
1 5 6
        1 5 6

The output is:

1 5.09 6.32 3 4.12 5 6

What I have is the output stream and the input streams. I need to compute the composition that lead to that output. an exact composition doesn't have to exists - I need a composition as close as possible to the output (smallest accumulated difference).

e.g.:

Input:

Stream to mimic:

1 5.09 6.32 3 4.12 5 6

and a list:

1 2 3 4
2 4 5 6 7
1 5 6

Expected output:

Stream 0 starting at 1,
Stream 2 starting at 0,
Stream 2 starting at 4.

This seems like an NP problem, is there any fast way to solve this? it can be somewhat brute force (but not totally, its not theoretic problem) and it can give not the best answer as long as its close enough.

The algorithm will be usually used with stream to mimic with very long length (can be few megabytes) while it will have around 20 streams to be composed from, while each stream will be around kilobyte long.


I think you can speed up a greedy search a bit over the obvious. First of all, square each element in all of the streams involved. Then you are looking for a sum of squared streams that looks a lot like the squared target stream. Suppose that "it looks like" is the euclidean distance between the squared streams, considered as vectors.

Then we have (a-b)^2 = a^2 + b^2 - 2a.b. So if we can find the dot product of two vectors quickly, and we know their absolute size, we can find the distance quickly. But using the FFT and the http://en.wikipedia.org/wiki/Convolution_theorem, we can work out a.b_i where a is the target stream and b_i is stream b at some offset of i, by using the FFT to convolve a reversed version of b - for the cost of doing an FFT on a, an FFT on reversed b, and an FFT on the result, we get a.b_i for every offset i.

If we do a greedy search, the first step will be to find the b_i that makes (a-b_i)^2 smallest and subtract it from a. Then we are looking for a stream c_j that makes (a-b_i-c_j)^2 as small as possible. But this is a^2 + b_i^2 + c_j^2 - 2a.b_i - 2a.c_j + 2b_i.c_j and we have already calculated everything except b_i.c_j in the step above. If b and c are shorter streams it will be cheap to calculate b_i.c_j, and we can use the FFT as before.

So we have a not too horrible way to do a greedy search - at each stage subtract off the stream from the adjusted target stream so far that makes the residual smallest (considered as vectors in euclidean space), and carry on from there. At some stage we will find that none of the streams we have available make the residual any smaller. We can stop there, because our calculation above shows us that using two streams at once won't help either then - this follows because b_i.c_j >= 0, since each element of b_i is >= 0, because it is a square.

If you do a greedy search and are not satisfied, but have more cpu to burn, try Limited Discrepancy Search.


If I can use C#, LINQ & the Rx framework's System.Interactive extensions then this works:

First up - define a jagged array for the allowable arrays.

int[][] streams =
    new []
    {
        new [] { 1, 2, 3, 4, },
        new [] { 2, 4, 5, 6, 7, },
        new [] { 1, 5, 6, },
    };

Need an infinite iterator on integers to represent each step.

IEnumerable<int> steps =
    EnumerableEx.Generate(0, x => true, x => x + 1, x => x);

Need a random number generator to randomly select which streams to add to each step.

var rnd = new Random();

In my LINQ query I've used these operators:

  • Scan^ - runs an accumulator function over a sequence producing an output value for every input value
  • Where - filters the sequence based on the predicate
  • Empty - returns an empty sequence
  • Concat - concatenates two sequences
  • Skip - skips over the specified number of elements in a sequence
  • Any - returns true if the sequence contains any elements
  • Select - projects the sequence using a selector function
  • Sum - sums the values in the sequence

^ - from the Rx System.Interactive library

Now for the LINQ query that does all of the hard work.

IEnumerable<double> results =
    steps
        // Randomly select which streams to add to this step
        .Scan(Enumerable.Empty<IEnumerable<int>>(), (xs, _) =>
            streams.Where(st => rnd.NextDouble() > 0.8).ToArray())
        // Create a list of "Heads" & "Tails" for each step
        // Heads are the first elements of the current streams in the step
        // Tails are the remaining elements to push forward to the next step
        .Scan(new
        {
            Heads = Enumerable.Empty<int>(),
            Tails = Enumerable.Empty<IEnumerable<int>>()
        }, (acc, ss) => new
        {
            Heads = acc.Tails.Concat(ss)
                .Select(s => s.First()),
            Tails = acc.Tails.Concat(ss)
                .Select(s => s.Skip(1)).Where(s => s.Any()),
        })
        // Keep the Heads only
        .Select(x => x.Heads)
        // Filter out any steps that didn't produce any values
        .Where(x => x.Any())
        // Calculate the square root of the sum of the squares
        .Select(x => System.Math.Sqrt((double)x.Select(y => y * y).Sum()));

Nice lazy evaluation per step - scary though...

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜