开发者

Sum of products of two arrays (dotproduct)

First off, I know my title can be formulated better, but my math classes are so far gone I can't remember the correct words anymore..

I need to do something like this (pseudo c#)

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int result = digits1*digits2

This would be the sum of the product of element[开发者_开发技巧i] of each array.

This obviously doesn't work. Any suggestions towards either a better title or the solution?

EDIT clarification: I know I could loop them both and do the math. Basically I would think there is a better way to do this and I'm looking for it purely out of personal curiousity.


With LINQ:

int dotProduct = digits1.Zip(digits2, (d1, d2) => d1 * d2)
                        .Sum();

Zipwill produce a streaming sequence containing the products of corresponding elements from both arrays, which is then summed into an integer with Sum.

Note that this will not fail like it should when the arrays of unequal length, so you probably need to validate the input:

//null checks here

if(digits1.Length != digits2.Length)
   throw new ArgumentException("...");

EDIT: As Jeff M points out,Enumerable.Zipwas only added to the framework in .NET 4.0. In .NET 3.5, you can do this (the idea is only efficient for collections that expose fast indexers):

int dotProduct = Enumerable.Range(0, digits1.Length)
                           .Sum(i => digits1[i] * digits2[i]);

//from Jeff M's comment:
int dotProduct = digits1.Select((n, i) => n * digits2[i])
                        .Sum();


Solutions with LINQ

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};

int result1 = digits1.Zip(digits2, (x, y) => x * y).Sum();

int result2 = digits1.Select((x, y) => x * digits2.ElementAt(y)).Sum();

int result3 = digits1.Select((n, i) => n * digits2[i]).Sum();

// Ani answer
int result4 = Enumerable.Range(0, digits1.Length)
    .Sum(i => digits1[i] * digits2[i]);

Performance test 100000 iterations:

Queries
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614
Fn: Result 3       Ticks 130034
Fn: Result 4       Ticks 123374

-------------

Fastest
Fn: Result 4       Ticks 123374
Fn: Result 3       Ticks 130034
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614


I wrote a test bench to compare the these methods' times on my machine.

Specs:
Windows 7 Professional 64-bit
Intel Core 2 Quad Q9550 @ 2.83GHz
4x1GiB Corsair Dominator DDR2 1066 (PC2-8500)

using System;
using System.Linq;

namespace Testbench
{
    class Program
    {
        static void Main(string[] args)
        {
            var digits1 = Enumerable.Range(0, 500).ToArray();
            var digits2 = digits1.ToArray(); // create a copy

            Test("Regular Loop", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i++)
                {
                    result += digits1[i] * digits2[i];
                }
                return result;
            });

            // using LINQ
            Test("Enumerable \"Loop\"", () => Enumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Zip", () => digits1.Zip(digits2, (x, y) => x * y).Sum());
            Test("Using Indexed Select", () => digits1.Select((n, i) => n * digits2[i]).Sum());
            Test("Using Indexed Select with ElementAt", () => digits1.Select((n, i) => n * digits2.ElementAt(i)).Sum());

            // using PLINQ
            Test("Parallel Enumerable \"Loop\"", () => ParallelEnumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Parallel Zip", () => digits1.AsParallel().Zip(digits2.AsParallel(), (x, y) => x * y).Sum());
            Test("Using Parallel Indexed Select", () => digits1.AsParallel().Select((n, i) => n * digits2[i]).Sum());
            Test("Using Parallel Indexed Select with ElementAt", () => digits1.AsParallel().Select((n, i) => n * digits2.ElementAt(i)).Sum());

            Console.Write("Press any key to continue . . . ");
            Console.ReadKey(true);
            Console.WriteLine();
        }

        static void Test<T>(string testName, Func<T> test, int iterations = 1000000)
        {
            Console.WriteLine(testName);
            Console.WriteLine("Iterations: {0}", iterations);
            var results = Enumerable.Repeat(0, iterations).Select(i => new System.Diagnostics.Stopwatch()).ToList();
            var timer = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < results.Count; i++)
            {
                results[i].Start();
                test();
                results[i].Stop();
            }
            timer.Stop();
            Console.WriteLine("Time(ms): {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedMilliseconds), results.Average(t => t.ElapsedMilliseconds), results.Max(t => t.ElapsedMilliseconds), timer.ElapsedMilliseconds);
            Console.WriteLine("Ticks:    {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedTicks), results.Average(t => t.ElapsedTicks), results.Max(t => t.ElapsedTicks), timer.ElapsedTicks);
            Console.WriteLine();
        }
    }
}

32-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (      1172)
Ticks:      3/  3.101365/     526 (   3244251)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   4.3E-05/      25 (      9054)
Ticks:     24/  24.93989/   69441 (  25052172)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.4E-05/      16 (     16282)
Ticks:     41/ 44.941406/   45395 (  45052491)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   5.3E-05/      32 (     13473)
Ticks:     34/ 37.165088/   89602 (  37280177)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   1.5E-05/       6 (    160215)
Ticks:    405/443.154147/   17821 ( 443306156)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/  0.000103/      29 (     17194)
Ticks:     38/ 47.412312/   81906 (  47576133)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/   9.4E-05/      19 (     21703)
Ticks:     49/ 59.859005/   53200 (  60051081)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/  0.000114/      27 (     20579)
Ticks:     45/ 56.758491/   75455 (  56943627)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   8.1E-05/      19 (     61137)
Ticks:    144/ 168.97909/   53320 ( 169165086)

64-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (       506)
Ticks:      1/  1.254137/    1491 (   1401969)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   2.9E-05/      15 (     10118)
Ticks:     27/ 27.850086/   41954 (  27995994)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.2E-05/      13 (     17089)
Ticks:     45/ 47.132834/   38506 (  47284608)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   3.1E-05/      12 (     14057)
Ticks:     37/ 38.740923/   33846 (  38897274)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   3.8E-05/      29 (    117412)
Ticks:    304/324.711279/   82726 ( 324872753)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   9.9E-05/      28 (     24234)
Ticks:     38/  66.79389/   77578 (  67054956)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/  0.000111/      24 (     30193)
Ticks:     46/ 83.264037/   69029 (  83542711)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/   6.5E-05/      20 (     28417)
Ticks:     45/ 78.337831/   56354 (  78628396)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   9.2E-05/      16 (     65233)
Ticks:    112/180.154663/   44799 ( 180496754)


Very simply, do a loop.

int sum = 0;
for(int i = 0; i < digits1.length && i < digits2.length; i++)
{
    sum += digits1[i] * digits2[i];
}

Boom.


int result = 0;
for(int i = 0; i < digits1.length; i++)
{
    result += digits1[i] * digits2[i];
}


Even faster is to unroll the loop

Test("Regular Loop", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i++)
                {
                    result += digits1[i] * digits2[i];
                }
                return result;
            });
            // This will fail if vectors are not a multiple of 4 in length.
            Test("Unroll 4x", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i+=4)
                {
                    result += digits1[i] * digits2[i];
                    result += digits1[i+1] * digits2[i+1];
                    result += digits1[i+2] * digits2[i+2];
                    result += digits1[i+3] * digits2[i+3];
                }
                return result;
            });

            Test("Dynamic unroll", () =>
            {
                int result = 0;
                int limit = (digits1.Length/8)*8;
                int reminderLimit = digits1.Length;

                if (digits1.Length >= 8)
                {
                    for (int i = 0; i < limit; i+=8)
                    {
                        result += digits1[i] * digits2[i];
                        result += digits1[i+1] * digits2[i+1];
                        result += digits1[i+2] * digits2[i+2];
                        result += digits1[i+3] * digits2[i+3];
                        result += digits1[i+4] * digits2[i+4];
                        result += digits1[i+5] * digits2[i+5];
                        result += digits1[i+6] * digits2[i+6];
                        result += digits1[i+7] * digits2[i+7];
                    }

                    reminderLimit = digits1.Length % 8;
                }

                switch(reminderLimit)
                {
                    case 7: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                result += digits1[limit+5] * digits2[limit+5];
                                result += digits1[limit+6] * digits2[limit+6];
                                break;
                            }

                    case 6: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                result += digits1[limit+5] * digits2[limit+5];      
                                break;
                            }

                    case 5: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                result += digits1[limit+4] * digits2[limit+4];
                                break;
                            }

                    case 4: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                result += digits1[limit+3] * digits2[limit+3];
                                break;
                            }

                    case 3: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                result += digits1[limit+2] * digits2[limit+2];
                                break;
                            }

                    case 2: {
                                result += digits1[limit] * digits2[limit];
                                result += digits1[limit+1] * digits2[limit+1];
                                break;
                            }

                    case 1: {
                                result += digits1[limit] * digits2[limit];
                                break;
                            }
                    default :
                            {
                                break;
                            }
                }

                return result;
            });

There is a huge difference is running time between debug and release mode of C# code, this is run with release mode:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (       596)
Ticks:      1/  2,071213/     455 (   2154248)

Unroll 4x
Iterations: 1000000
Time(ms):   0/     2E-06/       1 (       575)
Ticks:      1/  1,984301/    3876 (   2076105)

Dynamic unroll
Iterations: 1000000
Time(ms):   0/         0/       0 (       430)
Ticks:      1/    1,4635/    3228 (   1554830)

Debug mode:

Regular Loop
Iterations: 1000000
Time(ms):   0/     1E-06/       1 (      1296)
Ticks:      4/  4,529916/    3907 (   4678354)

Unroll 4x
Iterations: 1000000
Time(ms):   0/         0/       0 (       871)
Ticks:      2/  3,048466/     701 (   3145277)

Dynamic unroll
Iterations: 1000000
Time(ms):   0/         0/       0 (       819)
Ticks:      2/  2,858588/    1398 (   2957179)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜