F# - Facebook Hacker Cup - Double Squares
I'm working on strengthening my F#-fu and decided to tackle the Facebook Hacker Cup Double Squares problem. I'm having some problems with the run-time and was wondering if anyone could help me figure out why it is so much slower than my C# equivalent.
There's a good description from another post;
Source: Facebook Hacker Cup Qualification Round 2011
A double-square number is an integer X which can be expressed as the sum of two perfect squares. For example, 10 is a double-square because 10 = 3^2 + 1^2. Given X, how can we determine the number of ways in which it can be written as the sum of two squares? For example, 10 can only be written as 3^2 + 1^2 (we don't count 1^2 + 3^2 as being different). On the other hand, 25 can be written as 5^2 + 0^2 or as 4^2 + 3^2.
You need to solve this problem for 0 ≤ X ≤ 2,147,483,647.
Examples:
10 => 1
25 => 2
3 => 0
0 => 1
1 => 1
Numbers From Competition
1740798996
1257431873 2147483643 602519112 858320077 1048039120 415485223 874566596 1022907856 65 421330820 1041493518 5 1328649093 1941554117 4225 2082925 0 1 3
My basic strategy (which I'm open to critique on) is to;
- Create a dictionary (for memoize) of the input numbers initialzed to 0
- Get the largest number (LN) and pass it to count/memo function
- Get the LN square root as int
- Calculate squares for all numbers 0 to LN and store in dict
- Sum squares for non repeat combinations of numbers from 0 to LN
- If sum is in memo dict, add 1 to memo
- Finally, output the counts of the original numbers.
Here is the F# code (See code changes at bottom) I've written that I believe corresponds to this strategy (Runtime: ~8:10);
open System
open System.Collections.Generic
open System.IO
/// Get a sequence of values
let rec range min max =
seq { for num in [min .. max] do yield num }
/// Get a sequence starting from 0 and going to max
let rec zeroRange max = range 0 max
/// Find the maximum number in a list with a starting accumulator (acc)
let rec maxNum acc = function
| [] -> acc
| p::tail when p > acc -> maxNum p tail
| p::tail -> maxNum acc tail
/// A helper for finding max that sets the accumulator to 0
let rec findMax nums = maxNum 0 nums
/// Build a collection of combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)
let rec combos range =
seq {
let count = ref 0
for inner in range do
for outer in Seq.skip !count range do
yield (inner, outer)
count := !count + 1
}
let rec squares nums =
let dict = new Dictionary<int, int>()
for s in nums do
dict.[s] <- (s * s)
dict
/// Counts the number of possible double squares for a given number and keeps track of other counts that are provided in the memo dict.
let rec countDoubleSquares (num: int) (memo: Dictionary<int, int>) =
// The highest relevent square is the square root because it squared plus 0 squared is the top most possibility
let maxSquare = System.Math.Sqrt((float)num)
// Our relevant squares are 0 to the highest possible square; note the cast to int which shouldn't hurt.
let relSquares = range 0 ((int)maxSquare)
// calculate the squares up front;
let calcSquares = squares relSquares
// Build up our square combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)
for (sq1, sq2) in combos relSquares do
let v = calcSquares.[sq1] + calcSquares.[sq2]
// Memoize our relevant results
if memo.ContainsKey(v) then
memo.[v] <- memo.[v] + 1
// return our count for the num passed in
memo.[num]
// Read our numbers from file.
//let lines = File.ReadAllLines("test2.txt")
//let nums = [ for line in Seq.skip 1 lines -> Int32.Parse(line) ]
// Optionally, read them from straight array
let nums = [1740798996; 1257431873; 2147483643; 602519112; 858320077; 1048039120; 415485223; 874566596; 1022907856; 65; 421330820; 1041493518; 5; 1328649093; 1941554117; 4225; 2082925; 0; 1; 3]
// Initialize our memoize dictionary
let memo = new Dictionary<int, int>()
for num in nums do
memo.[num] <- 0
// Get the largest number in our set, all other numbers will be memoized along the way
let maxN = findMax nums
// Do the memoize
let maxCount = countDoubleSquares maxN memo
// Output our results.
for num in nums do
printfn "%i" memo.[num]
// Have a little pause for when we debug
let line = Console.Read()
And here 开发者_开发技巧is my version in C# (Runtime: ~1:40:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
namespace FBHack_DoubleSquares
{
public class TestInput
{
public int NumCases { get; set; }
public List<int> Nums { get; set; }
public TestInput()
{
Nums = new List<int>();
}
public int MaxNum()
{
return Nums.Max();
}
}
class Program
{
static void Main(string[] args)
{
// Read input from file.
//TestInput input = ReadTestInput("live.txt");
// As example, load straight.
TestInput input = new TestInput
{
NumCases = 20,
Nums = new List<int>
{
1740798996,
1257431873,
2147483643,
602519112,
858320077,
1048039120,
415485223,
874566596,
1022907856,
65,
421330820,
1041493518,
5,
1328649093,
1941554117,
4225,
2082925,
0,
1,
3,
}
};
var maxNum = input.MaxNum();
Dictionary<int, int> memo = new Dictionary<int, int>();
foreach (var num in input.Nums)
{
if (!memo.ContainsKey(num))
memo.Add(num, 0);
}
DoMemoize(maxNum, memo);
StringBuilder sb = new StringBuilder();
foreach (var num in input.Nums)
{
//Console.WriteLine(memo[num]);
sb.AppendLine(memo[num].ToString());
}
Console.Write(sb.ToString());
var blah = Console.Read();
//File.WriteAllText("out.txt", sb.ToString());
}
private static int DoMemoize(int num, Dictionary<int, int> memo)
{
var highSquare = (int)Math.Floor(Math.Sqrt(num));
var squares = CreateSquareLookup(highSquare);
var relSquares = squares.Keys.ToList();
Debug.WriteLine("Starting - " + num.ToString());
Debug.WriteLine("RelSquares.Count = {0}", relSquares.Count);
int sum = 0;
var index = 0;
foreach (var square in relSquares)
{
foreach (var inner in relSquares.Skip(index))
{
sum = squares[square] + squares[inner];
if (memo.ContainsKey(sum))
memo[sum]++;
}
index++;
}
if (memo.ContainsKey(num))
return memo[num];
return 0;
}
private static TestInput ReadTestInput(string fileName)
{
var lines = File.ReadAllLines(fileName);
var input = new TestInput();
input.NumCases = int.Parse(lines[0]);
foreach (var lin in lines.Skip(1))
{
input.Nums.Add(int.Parse(lin));
}
return input;
}
public static Dictionary<int, int> CreateSquareLookup(int maxNum)
{
var dict = new Dictionary<int, int>();
int square;
foreach (var num in Enumerable.Range(0, maxNum))
{
square = num * num;
dict[num] = square;
}
return dict;
}
}
}
Thanks for taking a look.
UPDATE
Changing the combos function slightly will result in a pretty big performance boost (from 8 min to 3:45):
/// Old and Busted...
let rec combosOld range =
seq {
let rangeCache = Seq.cache range
let count = ref 0
for inner in rangeCache do
for outer in Seq.skip !count rangeCache do
yield (inner, outer)
count := !count + 1
}
/// The New Hotness...
let rec combos maxNum =
seq {
for i in 0..maxNum do
for j in i..maxNum do
yield i,j }
Again, the number of integer solutions of x^2 + y^2 = k is either
- zero, if there is a prime factor equal to 3 mod 4
- four times the number of prime divisors of k which are equal to 1 mod 4.
Note that in the second alternative, you count a^2 + b^2 as a different solution to (-a)^2 + b^2 (and other signs), and to b^2 + a^2. So you may want to divide by four, and divide again by 2 (taking the floor, as @Wei Hu points out) if you want solutions as sets instead of ordered pairs.
Knowing this, writing a program which gives the number of solutions is easy: compute prime numbers up to 46341 once and for all.
Given k, compute the prime divisors of k by using the above list (test up to sqrt(k)). Count the ones which are equal to 1 mod 4, and sum. Multiply answer by 4 if needed.
All this is a one or two liner in any lazy functional language (I don't know enough f#, in Haskell it would be two lines long), once you have a primes
infinite sequence: counting the divisors = 1 mod 4 (filterby |> count
or something along these lines) is something quite natural.
I suspect it is faster than brute forcing the decompositions.
Your F# combos
function has awful perf. Something like
let rec combos range =
let a = range |> Seq.toArray
seq {
for i in 0..a.Length-1 do
for j in i..a.Length-1 do
yield i,j }
should be a big speedup.
I like sequences, but in this case, they are probably the wrong tool for the job. This problem is a chance to try out a non-trivial recursive solution. Using mutation, it would be easy to do an algorithm like this (in Python, my pseudo-code of choice..)
def f(n):
i = 0
j = int(1 + sqrt(n))
count = 0
# 'i' will always be increased, and j will always be decreased. We
# will stop if i > j, so we can avoid duplicate pairs.
while i <= j:
s = i * i + j * j
if s < n:
# if any answers exist for this i, they were with higher
# j values. So, increment i.
i += 1
elif s > n:
# likewise, if there was an answer with this j, it was
# found with a smaller i. so, decrement it.
j -= 1
else:
# found a solution. Count it, and then move both i and
# j, because they will be found in at most one solution pair.
count += 1
i += 1
j -= 1
return count
Now, this seems to work. Maybe its not right, or not the best, but I like how the recursive code looks in F#. (warning.. I don't have F# on this computer... but I hope I got this right.)
let f n =
let rec go i j count =
if i > j then count
else
let s = i * i + j * j
if s < n then
go (i + 1) j count
else if s > n then
go i (j - 1) count
else
go (i + 1) (j - 1) (count + 1)
go 0 (1 + (n |> float |> sqrt |> int)) 0
This solution runs in O(sqrt(N)) time for each call, and requires constant memory. The memoizing solution takes O(N) time to setup the dictionary and the dictionary size is at least O(sqrt(N)). For large N, these are quite different.
EDIT
Given the C# code, the biggest difference is that the C# code is looping over a list, while the F# code is iterating over a sequence. The Seq.cache is only really a help when you are actually doing calculations over the seq, in this case, it doesn't avoid walking over the sequence repeatedly.
This function will be more like the C# code, but it requires the input to be an array, not just any sequence.
But, as you say elsewhere, a better algorithm is needed overall.
let combos (ss: int []) =
let rec helper idx =
seq {
if idx < ss.Length then
for jdx in idx + 1 .. ss.Length - 1 do
yield (ss.[idx], ss.[jdx])
yield! helper (idx + 1)
}
helper 0
END EDIT
This may be part of why this is slower than equivalent C# code, although I think IEnumerable would have a similar problem.
let rec combos range =
seq {
let count = ref 0
for inner in range do
for outer in Seq.skip !count range do
yield (inner, outer)
count := !count + 1
}
The double loop over range is causing it to be evaluated over and over again. Seq.cache
can be used to avoid this, if the sequence must be looked at repeatedly.
This should help some.
// Build up our square combinations;
for (sq1, sq2) in combos maxSquare do
let v = calcSquares.[sq1] + calcSquares.[sq2]
// Memoize our relevant results
match memo.TryGetValue v with
| true, value -> memo.[v] <- value + 1
| _ -> ()
I've been using F# for Facebook hacker cup. Here is my solution which finishes in less than 0.5 seconds. While I agree that you should do as functional as you can, but sometimes going imperative is a better approach, especially in a time-constrained coding competition.
let int_sqrt (n:int) :int =
int (sqrt (float n))
let double_square (x: int) :int =
let mutable count = 0
let square_x = int_sqrt x
for i in 0..square_x do
let y = int_sqrt (x - i*i)
if y*y + i*i = x && i<=y then count <- count + 1
count
Without checking the code, your algorithm is sub-optimal. I solved this with the following:
Let N be the number we're testing.
Let X,Y such that X^2 + Y^2 = N
For all 0 <= X < sqrt(N)
Let Ysquared = N - X^2
if( Ysquared > Xsquared ) break; //prevent duplicate solutions
if( isInteger( sqrt(ySquared) ) )
//count unique solution
It's an improvement of Stefan Kendall's answer. Consider: N=a^2+b^2 and a>=b>=0 Then: sqrt(N/2)<=a<=sqrt(N) in reals. To get an integer interval must be rounded up the first term (sqrt(N/2)) and rounded down the last term (sqrt(N)).
public class test001 {
public static void main(String[] args) {
int k=0,min=0,max=0;
double[] list={1740798996,1257431873,2147483643,
602519112,858320077,1048039120,415485223,874566596,
1022907856,65,421330820,1041493518,5,1328649093,
1941554117,4225,2082925,0,1,3};
for(int i=0 ; i<=list.length-1 ; i++){
if (Math.sqrt(list[i]/2)%1==0)
min=(int)Math.sqrt(list[i]/2);
else
min=(int)(Math.sqrt(list[i]/2))+1;
//Rounded up
max=(int)Math.sqrt(list[i]);
//Rounded down
if(max>=min)
for(int j=min ; j<=max ; j++)
if(Math.sqrt(list[i]-j*j)%1==0) k++;
//If sqrt(N-j^2) is an integer then "k" increases.
System.out.println((int)list[i]+": "+k);
k=0;
}
}
}
精彩评论