开发者

F# - Facebook Hacker Cup - Double Squares

I'm working on strengthening my F#-fu and decided to tackle the Facebook Hacker Cup Double Squares problem. I'm having some problems with the run-time and was wondering if anyone could help me figure out why it is so much slower than my C# equivalent.

There's a good description from another post;

Source: Facebook Hacker Cup Qualification Round 2011

A double-square number is an integer X which can be expressed as the sum of two perfect squares. For example, 10 is a double-square because 10 = 3^2 + 1^2. Given X, how can we determine the number of ways in which it can be written as the sum of two squares? For example, 10 can only be written as 3^2 + 1^2 (we don't count 1^2 + 3^2 as being different). On the other hand, 25 can be written as 5^2 + 0^2 or as 4^2 + 3^2.

You need to solve this problem for 0 ≤ X ≤ 2,147,483,647.

Examples:

10 => 1

25 => 2

3 => 0

0 => 1

1 => 1

Numbers From Competition

1740798996

1257431873

2147483643

602519112

858320077

1048039120

415485223

874566596

1022907856

65

421330820

1041493518

5

1328649093

1941554117

4225

2082925

0

1

3

My basic strategy (which I'm open to critique on) is to;

  1. Create a dictionary (for memoize) of the input numbers initialzed to 0
  2. Get the largest number (LN) and pass it to count/memo function
  3. Get the LN square root as int
  4. Calculate squares for all numbers 0 to LN and store in dict
  5. Sum squares for non repeat combinations of numbers from 0 to LN

    • If sum is in memo dict, add 1 to memo
  6. Finally, output the counts of the original numbers.

Here is the F# code (See code changes at bottom) I've written that I believe corresponds to this strategy (Runtime: ~8:10);

open System
open System.Collections.Generic
open System.IO

/// Get a sequence of values
let rec range min max = 
    seq { for num in [min .. max] do yield num }

/// Get a sequence starting from 0 and going to max
let rec zeroRange max = range 0 max    

/// Find the maximum number in a list with a starting accumulator (acc)
let rec maxNum acc = function
    | [] -> acc
    | p::tail when p > acc -> maxNum p tail
    | p::tail -> maxNum acc tail

/// A helper for finding max that sets the accumulator to 0
let rec findMax nums = maxNum 0 nums

/// Build a collection of combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)
let rec combos range =    
    seq { 
          let count = ref 0
          for inner in range do 
              for outer in Seq.skip !count range do 
                  yield (inner, outer)
              count := !count + 1          
        }

let rec squares nums = 
    let dict = new Dictionary<int, int>()
    for s in nums do
        dict.[s] <- (s * s)
    dict

/// Counts the number of possible double squares for a given number and keeps track of other counts that are provided in the memo dict.
let rec countDoubleSquares (num: int) (memo: Dictionary<int, int>) =
    // The highest relevent square is the square root because it squared plus 0 squared is the top most possibility
    let maxSquare = System.Math.Sqrt((float)num)

    // Our relevant squares are 0 to the highest possible square; note the cast to int which shouldn't hurt.
    let relSquares = range 0 ((int)maxSquare)

    // calculate the squares up front;
    let calcSquares = squares relSquares

    // Build up our square combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)
    for (sq1, sq2) in combos relSquares do
        let v = calcSquares.[sq1] + calcSquares.[sq2]
        // Memoize our relevant results
        if memo.ContainsKey(v) then            
            memo.[v] <- memo.[v] + 1

    // return our count for the num passed in
    memo.[num]


// Read our numbers from file.
//let lines = File.ReadAllLines("test2.txt")
//let nums = [ for line in Seq.skip 1 lines -> Int32.Parse(line) ]

// Optionally, read them from straight array
let nums = [1740798996; 1257431873; 2147483643; 602519112; 858320077; 1048039120; 415485223; 874566596; 1022907856; 65; 421330820; 1041493518; 5; 1328649093; 1941554117; 4225; 2082925; 0; 1; 3]

// Initialize our memoize dictionary
let memo = new Dictionary<int, int>()
for num in nums do
    memo.[num] <- 0

// Get the largest number in our set, all other numbers will be memoized along the way
let maxN = findMax nums

// Do the memoize
let maxCount = countDoubleSquares maxN memo

// Output our results.
for num in nums do
    printfn "%i" memo.[num]

// Have a little pause for when we debug
let line = Console.Read()

And here 开发者_开发技巧is my version in C# (Runtime: ~1:40:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;

namespace FBHack_DoubleSquares
{
    public class TestInput
    {
        public int NumCases { get; set; }
        public List<int> Nums { get; set; }

        public TestInput()
        {
            Nums = new List<int>();
        }

        public int MaxNum()
        {
            return Nums.Max();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Read input from file.
            //TestInput input = ReadTestInput("live.txt");

            // As example, load straight.
            TestInput input = new TestInput
            {
                NumCases = 20,
                Nums = new List<int>
                {
                    1740798996,
                    1257431873,
                    2147483643,
                    602519112,
                    858320077,
                    1048039120,
                    415485223,
                    874566596,
                    1022907856,
                    65,
                    421330820,
                    1041493518,
                    5,
                    1328649093,
                    1941554117,
                    4225,
                    2082925,
                    0,
                    1,
                    3,
                }
            };

            var maxNum = input.MaxNum();

            Dictionary<int, int> memo = new Dictionary<int, int>();
            foreach (var num in input.Nums)
            {
                if (!memo.ContainsKey(num))
                    memo.Add(num, 0);
            }

            DoMemoize(maxNum, memo);

            StringBuilder sb = new StringBuilder();
            foreach (var num in input.Nums)
            {
                //Console.WriteLine(memo[num]);
                sb.AppendLine(memo[num].ToString());
            }

            Console.Write(sb.ToString());

            var blah = Console.Read();
            //File.WriteAllText("out.txt", sb.ToString());
        }

        private static int DoMemoize(int num, Dictionary<int, int> memo)
        {
            var highSquare = (int)Math.Floor(Math.Sqrt(num));

            var squares = CreateSquareLookup(highSquare);
            var relSquares = squares.Keys.ToList();

            Debug.WriteLine("Starting - " + num.ToString());
            Debug.WriteLine("RelSquares.Count = {0}", relSquares.Count);

            int sum = 0;
            var index = 0;            
            foreach (var square in relSquares)
            {
                foreach (var inner in relSquares.Skip(index))
                {
                    sum = squares[square] + squares[inner];
                    if (memo.ContainsKey(sum))
                        memo[sum]++;
                }
                index++;
            }

            if (memo.ContainsKey(num))
                return memo[num];

            return 0;            
        }

        private static TestInput ReadTestInput(string fileName)
        {
            var lines = File.ReadAllLines(fileName);
            var input = new TestInput();
            input.NumCases = int.Parse(lines[0]);
            foreach (var lin in lines.Skip(1))
            {
                input.Nums.Add(int.Parse(lin));
            }

            return input;
        }

        public static Dictionary<int, int> CreateSquareLookup(int maxNum)
        {
            var dict = new Dictionary<int, int>();
            int square;
            foreach (var num in Enumerable.Range(0, maxNum))
            {
                square = num * num;
                dict[num] = square;
            }

            return dict;
        }
    }   
}

Thanks for taking a look.

UPDATE

Changing the combos function slightly will result in a pretty big performance boost (from 8 min to 3:45):

/// Old and Busted...
let rec combosOld range =    
    seq { 
          let rangeCache = Seq.cache range
          let count = ref 0
          for inner in rangeCache do 
              for outer in Seq.skip !count rangeCache do 
                  yield (inner, outer)
              count := !count + 1          
        }

/// The New Hotness...
let rec combos maxNum =    
    seq {
        for i in 0..maxNum do
            for j in i..maxNum do
                yield i,j } 


Again, the number of integer solutions of x^2 + y^2 = k is either

  • zero, if there is a prime factor equal to 3 mod 4
  • four times the number of prime divisors of k which are equal to 1 mod 4.

Note that in the second alternative, you count a^2 + b^2 as a different solution to (-a)^2 + b^2 (and other signs), and to b^2 + a^2. So you may want to divide by four, and divide again by 2 (taking the floor, as @Wei Hu points out) if you want solutions as sets instead of ordered pairs.

Knowing this, writing a program which gives the number of solutions is easy: compute prime numbers up to 46341 once and for all.

Given k, compute the prime divisors of k by using the above list (test up to sqrt(k)). Count the ones which are equal to 1 mod 4, and sum. Multiply answer by 4 if needed.

All this is a one or two liner in any lazy functional language (I don't know enough f#, in Haskell it would be two lines long), once you have a primes infinite sequence: counting the divisors = 1 mod 4 (filterby |> count or something along these lines) is something quite natural.

I suspect it is faster than brute forcing the decompositions.


Your F# combos function has awful perf. Something like

let rec combos range =
    let a = range |> Seq.toArray
    seq {
        for i in 0..a.Length-1 do
            for j in i..a.Length-1 do
                yield i,j } 

should be a big speedup.


I like sequences, but in this case, they are probably the wrong tool for the job. This problem is a chance to try out a non-trivial recursive solution. Using mutation, it would be easy to do an algorithm like this (in Python, my pseudo-code of choice..)

def f(n):
    i = 0
    j = int(1 + sqrt(n))
    count = 0
    # 'i' will always be increased, and j will always be decreased.  We
    # will stop if i > j, so we can avoid duplicate pairs.
    while i <= j:
        s = i * i + j * j
        if s < n:
            # if any answers exist for this i, they were with higher
            # j values.  So, increment i.
            i +=  1
        elif s > n:
            # likewise, if there was an answer with this j, it was
            # found with a smaller i.  so, decrement it.
            j -= 1
        else:
            # found a solution.  Count it, and then move both i and 
            # j, because they will be found in at most one solution pair.
            count += 1
            i += 1
            j -= 1
    return count    

Now, this seems to work. Maybe its not right, or not the best, but I like how the recursive code looks in F#. (warning.. I don't have F# on this computer... but I hope I got this right.)

let f n = 
    let rec go i j count =
        if i > j then count
        else
            let s = i * i + j * j
            if s < n then 
                go (i + 1) j count
            else if s > n then
                go i (j - 1) count
            else
                go (i + 1) (j - 1) (count + 1)
    go 0 (1 + (n |> float |> sqrt |> int)) 0

This solution runs in O(sqrt(N)) time for each call, and requires constant memory. The memoizing solution takes O(N) time to setup the dictionary and the dictionary size is at least O(sqrt(N)). For large N, these are quite different.


EDIT

Given the C# code, the biggest difference is that the C# code is looping over a list, while the F# code is iterating over a sequence. The Seq.cache is only really a help when you are actually doing calculations over the seq, in this case, it doesn't avoid walking over the sequence repeatedly.

This function will be more like the C# code, but it requires the input to be an array, not just any sequence.

But, as you say elsewhere, a better algorithm is needed overall.

let combos (ss: int []) =
    let rec helper idx =
    seq {
        if idx < ss.Length then
            for jdx in idx + 1 .. ss.Length - 1 do
                 yield (ss.[idx], ss.[jdx])
            yield! helper (idx + 1)
        }
    helper 0

END EDIT

This may be part of why this is slower than equivalent C# code, although I think IEnumerable would have a similar problem.

let rec combos range =    
seq { 
      let count = ref 0
      for inner in range do 
          for outer in Seq.skip !count range do 
              yield (inner, outer)
          count := !count + 1          
    }

The double loop over range is causing it to be evaluated over and over again. Seq.cache can be used to avoid this, if the sequence must be looked at repeatedly.


This should help some.

// Build up our square combinations;
for (sq1, sq2) in combos maxSquare do
    let v = calcSquares.[sq1] + calcSquares.[sq2]

    // Memoize our relevant results
    match memo.TryGetValue v with
    | true, value -> memo.[v] <- value + 1
    | _ -> ()


I've been using F# for Facebook hacker cup. Here is my solution which finishes in less than 0.5 seconds. While I agree that you should do as functional as you can, but sometimes going imperative is a better approach, especially in a time-constrained coding competition.

let int_sqrt (n:int) :int =
    int (sqrt (float n))

let double_square (x: int) :int =
    let mutable count = 0
    let square_x = int_sqrt x
    for i in 0..square_x do
        let y = int_sqrt (x - i*i)
        if y*y + i*i = x && i<=y then count <- count + 1
    count


Without checking the code, your algorithm is sub-optimal. I solved this with the following:

Let N be the number we're testing.
Let X,Y such that X^2 + Y^2 = N
For all 0 <= X < sqrt(N)
   Let Ysquared = N - X^2
   if( Ysquared > Xsquared ) break; //prevent duplicate solutions
   if( isInteger( sqrt(ySquared) ) )
      //count unique solution


It's an improvement of Stefan Kendall's answer. Consider: N=a^2+b^2 and a>=b>=0 Then: sqrt(N/2)<=a<=sqrt(N) in reals. To get an integer interval must be rounded up the first term (sqrt(N/2)) and rounded down the last term (sqrt(N)).

public class test001 {
    public static void main(String[] args) {
        int k=0,min=0,max=0;
        double[] list={1740798996,1257431873,2147483643,
        602519112,858320077,1048039120,415485223,874566596,
        1022907856,65,421330820,1041493518,5,1328649093,
        1941554117,4225,2082925,0,1,3};
        for(int i=0 ; i<=list.length-1 ; i++){
            if (Math.sqrt(list[i]/2)%1==0)
                min=(int)Math.sqrt(list[i]/2);
            else
                min=(int)(Math.sqrt(list[i]/2))+1;
                    //Rounded up
            max=(int)Math.sqrt(list[i]);
                    //Rounded down
            if(max>=min)
                for(int j=min ; j<=max ; j++)
                if(Math.sqrt(list[i]-j*j)%1==0) k++;
                //If sqrt(N-j^2) is an integer then "k" increases.
        System.out.println((int)list[i]+": "+k);
        k=0; 
        }
    }
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜