
Can the execution time of this prime number generator be improved?

My initial goal when writing this was to leave the smallest footprint possible. I can say with confidence that this goal has been met. Unfortunately, this leaves me with a rather slow implementation. Generating all primes below 2 million takes about 8 seconds on a 3 GHz Intel chip.

Is there any way to improve the execution time of this code with minimal sacrifice to the small memory footprint? Alternatively, am I going about this the wrong way when looking at it from a functional standpoint?

CODE

/// 6.5s for max = 2,000,000
let generatePrimeNumbers max =    
    let rec generate number numberSequence =
        if number * number > max then numberSequence else
        let filteredNumbers = numberSequence |> Seq.filter (fun v -> v = number || v % number <> 0L)
        let newNumberSequence = seq { for i in filteredNumbers -> i }
        let newNumber = newNumberSequence |> Seq.find (fun x -> x > number)
        generate newNumber newNumberSequence                
    generate 2L (seq { for i in 2L..max -> i })

Update

I tweaked the algorithm and managed to shave off 2 seconds, but memory consumption doubled.

/// 5.2s for max = 2,000,000
let generatePrimeNumbers max =    
    let rec generate number numberSequence =
        if number * number > max then numberSequence else
        let filteredNumbers = numberSequence |> Seq.filter (fun v -> v = number || v % number <> 0L) |> Seq.toArray |> Array.toSeq
        let newNumber = filteredNumbers |> Seq.find (fun v -> v > number)
        generate newNumber filteredNumbers                
    generate 2L (seq { for i in 2L..max -> i })

Update

Apparently, I was using an old compiler. With the latest version my original algorithm takes 6.5s rather than 8s. That is quite an improvement.


Tomas Petricek's function is pretty fast, but we can make it a little faster.

Compare the following:

let is_prime_tomas n =
    let ms = int64(sqrt(float(n)))
    let rec isPrimeUtil(m) =
        if m > ms then true
        elif n % m = 0L then false
        else isPrimeUtil(m + 1L)
    (n > 1L) && isPrimeUtil(2L)

let is_prime_juliet n =
    let maxFactor = int64(sqrt(float n))
    let rec loop testPrime tog =
        if testPrime > maxFactor then true
        elif n % testPrime = 0L then false
        else loop (testPrime + tog) (6L - tog)
    if n = 2L || n = 3L || n = 5L then true
    elif n <= 1L || n % 2L = 0L || n % 3L = 0L || n % 5L = 0L then false
    else loop 7L 4L

is_prime_juliet has a slightly faster inner loop. It's a well-known prime-generating strategy which uses a "toggle" to step through candidates in increments of 2 or 4, skipping the multiples of 2 and 3. For comparison:

> seq { 2L .. 2000000L } |> Seq.filter is_prime_tomas |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:03.628, CPU: 00:00:03.588, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933

> seq { 2L .. 2000000L } |> Seq.filter is_prime_juliet |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:01.530, CPU: 00:00:01.513, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933

My version is about 2.37x faster, and it's also pretty close to the speed of the fastest imperative versions. We can make it even faster: we don't need to filter the whole list of 2L .. 2000000L, because we can use the same strategy to generate a smaller sequence of candidate primes before we apply our filter:

> let getPrimes upTo =
    seq {
        yield 2L;
        yield 3L;
        yield 5L;
        yield! (7L, 4L) |> Seq.unfold (fun (p, tog) -> if p <= upTo then Some(p, (p + tog, 6L - tog)) else None)
    }
    |> Seq.filter is_prime_juliet;;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0

val getPrimes : int64 -> seq<int64>

> getPrimes 2000000L |> Seq.fold (fun acc _ -> acc + 1) 0;;
Real: 00:00:01.364, CPU: 00:00:01.357, GC gen0: 36, gen1: 1, gen2: 0
val it : int = 148933

We dropped from 1.530s to 1.364s, so we gained about 1.12x more speed. Awesome!
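
To see what the toggle actually produces, here is a small sketch that lists the candidates on their own. The name toggleCandidates is made up for illustration; the unfold is the same one used in getPrimes above. Every candidate is congruent to 1 or 5 modulo 6, so multiples of 2 and 3 never appear, but composites such as 25 and 35 still do, which is why the filter is still needed.

```fsharp
// Hypothetical helper: the candidates produced by the 7L/4L toggle.
// The step alternates +4, +2, +4, +2, ...
let toggleCandidates (upTo: int64) =
    (7L, 4L)
    |> Seq.unfold (fun (p, tog) ->
        if p <= upTo then Some(p, (p + tog, 6L - tog)) else None)

// Candidates up to 40; note that 25 and 35 are composite.
let firstFew = toggleCandidates 40L |> List.ofSeq
// firstFew = [7L; 11L; 13L; 17L; 19L; 23L; 25L; 29L; 31L; 35L; 37L]
```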


Just for funsies, let's take a look at this page.

pi(x) is the prime counting function; it returns the number of primes below x. You can approximate pi(x) using the following inequalities:

(x/log x)(1 + 0.992/log x) < pi(x) < (x/log x)(1 + 1.2762/log x) 
// The upper bound holds for all x > 1

p(n) is the nth prime function, which can be approximated using:

n ln n + n ln ln n - n < p(n) < n ln n + n ln ln n
// for n >= 6
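
As a rough sketch, the two pairs of bounds translate directly into F#. The function names below are made up for illustration, and log is F#'s natural logarithm. For x = 2,000,000 the two pi(x) estimates should bracket the count of 148933 measured earlier.

```fsharp
// Hypothetical helpers: lower and upper estimates for pi(x),
// using the constants quoted above.
let piLower (x: float) = (x / log x) * (1.0 + 0.992 / log x)
let piUpper (x: float) = (x / log x) * (1.0 + 1.2762 / log x)

// Hypothetical helpers: estimates for the nth prime, for n >= 6.
let nthPrimeLower (n: float) = n * log n + n * log (log n) - n
let nthPrimeUpper (n: float) = n * log n + n * log (log n)

let x = 2000000.0
printfn "%.0f < pi(%.0f) < %.0f" (piLower x) x (piUpper x)
```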

With that in mind, here is a very fast generator which computes the first n primes, where the element at index i holds the (i + 1)th prime, p(i + 1). So, if we want to cap our array at primes below 2,000,000, we use the upper-bound inequality for the prime counting function:

let rec is_prime (primes : int[]) i testPrime maxFactor =
    if primes.[i] > maxFactor then true
    else
        if testPrime % primes.[i] = 0 then false
        else is_prime primes (i + 1) testPrime maxFactor

let rec prime_n (primes : int[]) primeCount testPrime tog =
    if primeCount < primes.Length then
        let primeCount' =
            if is_prime primes 2 testPrime (float testPrime |> sqrt |> int) then
                primes.[primeCount] <- testPrime
                primeCount + 1
            else
                primeCount
        prime_n primes primeCount' (testPrime + tog) (6 - tog)

let getPrimes upTo =
    let x = float upTo
    let arraySize = int <| (x / log x) * (1.0 + 1.2762 / log x)
    let primes = Array.zeroCreate (max arraySize 3)
    primes.[0] <- 2
    primes.[1] <- 3
    primes.[2] <- 5

    prime_n primes 3 7 4
    primes

Cool! So how fast is it? On my 3.2 GHz quad-core, I get the following in fsi:

> let primes = getPrimes 2000000;;
Real: 00:00:00.534, CPU: 00:00:00.546, GC gen0: 0, gen1: 0, gen2: 0

val primes : int [] =
  [|2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71;
    73; 79; 83; 89; 97; 101; 103; 107; 109; 113; 127; 131; 137; 139; 149; 151;
    157; 163; 167; 173; 179; 181; 191; 193; 197; 199; 211; 223; 227; 229; 233;
    239; 241; 251; 257; 263; 269; 271; 277; 281; 283; 293; 307; 311; 313; 317;
    331; 337; 347; 349; 353; 359; 367; 373; 379; 383; 389; 397; 401; 409; 419;
    421; 431; 433; 439; 443; 449; 457; 461; 463; 467; 479; 487; 491; 499; 503;
    509; 521; 523; 541; ...|]

> printfn "total primes: %i. Last prime: %i" (primes.Length - 1) primes.[primes.Length - 1];;
total primes: 149973. Last prime: 2014853

So that's all the primes up to around 2 million in a little over half a second :)


EDIT: updated version below, uses less memory and is faster

Sometimes it's good to be able to mutate stuff. Here's an admittedly rather imperative version that trades memory for speed. Since this thread turned out to host a nice collection of prime-generating functions in F#, I thought it would be nice to add mine anyway. Using a BitArray keeps memory usage in check.

open System.Collections

let getPrimes nmax =
    let sieve = new BitArray(nmax+1, true)
    let result = new ResizeArray<int>(nmax/10)
    let upper = int (sqrt (float nmax))   
    result.Add(2)

    let mutable n = 3
    while n <= nmax do
       if sieve.[n] then
           if n<=upper then 
               let mutable i = n
               while i <= nmax do sieve.[i] <- false; i <- i + n
           result.Add n
       n <- n + 2
    result

Update:

The code above can be optimized further: since it only uses the odd indices in the sieve, the BitArray can be reduced to half the size by indexing odd n as 2m+1. The new version is also faster:

let getPrimes2 nmax =
    let sieve = new BitArray((nmax/2)+1, true)
    let result = new ResizeArray<int>(nmax/10)
    let upper = int (sqrt (float nmax))   
    if nmax>1 then result.Add(2) //fixes a subtle bug for nmax < 2

    let mutable m = 1
    while 2*m+1 <= nmax do
       if sieve.[m] then
           let n = 2*m+1
           if n <= upper then 
               let mutable i = m
               while 2*i < nmax do sieve.[i] <- false; i <- i + n
           result.Add n
       m <- m + 1
    result

Timing (Intel Core Duo 2.33 GHz):

> getPrimes 2000000 |> Seq.length;;
Real: 00:00:00.037, CPU: 00:00:00.046, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933
> getPrimes2 2000000 |> Seq.length;;
Real: 00:00:00.026, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 148933
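
The 2m+1 indexing in getPrimes2 can be checked in isolation. The sketch below uses made-up helper names (toNumber, toIndex, oddMultiples) and only illustrates the mapping: slot m of the half-size sieve stands for the odd number 2m+1, and stepping the index by n walks through the odd multiples of n, which is what the inner while loop relies on.

```fsharp
// Hypothetical helpers: slot m of the half-size sieve stands for 2m+1.
let toNumber m = 2 * m + 1
let toIndex n = (n - 1) / 2

// Stepping the index by n visits n, 3n, 5n, ... (the odd multiples of n).
let oddMultiples n count =
    List.init count (fun k -> toNumber (toIndex n + k * n))
// oddMultiples 3 4 = [3; 9; 15; 21]
```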


The imperative version posted by Yin is very fast. On my machine it also takes about 0.5 s. However, if you want to write a simple functional solution, you can just write this:

let isPrime(n) =
  let ms = int64(sqrt(float(n)))
  let rec isPrimeUtil(m) =
    if m > ms then true
    elif n % m = 0L then false
    else isPrimeUtil(m + 1L)
  (n > 1L) && isPrimeUtil(2L)

[ 1L .. 2000000L ] |> List.filter isPrime

This simply tests every number up to 2 million for primality. It doesn't use any sophisticated algorithms (it's actually funny that the simplest solution is often good enough!). On my machine, your updated version runs in about 11 seconds and this runs in about 2 seconds.

More interestingly, this is very easy to parallelize. If you use PLINQ, you can write the version below and it will run almost 2 times faster on a dual-core machine. This means that on a quad-core machine it could be as fast as the fastest solution from all the answers here, but with minimal programming effort :-) (of course, using four cores is not ecological, but... well)

[ 1L .. 2000000L ] |> PSeq.ofSeq |> PSeq.filter isPrime |> List.ofSeq

The PSeq functions are wrappers for PLINQ that I created for my book (it makes using PLINQ from F# more natural). They are available in source code for Chapter 14.
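
If the PSeq wrappers are not at hand, a similar effect can be sketched with Array.Parallel.choose from FSharp.Core. This swaps in a different API from the PLINQ version above, so treat it as an illustration rather than an equivalent; parallelPrimes is a made-up name, and isPrime is repeated only to keep the snippet self-contained. No timing is claimed for it here.

```fsharp
// Same trial-division test as isPrime above, repeated for self-containment.
let isPrime (n: int64) =
    let ms = int64 (sqrt (float n))
    let rec isPrimeUtil m =
        if m > ms then true
        elif n % m = 0L then false
        else isPrimeUtil (m + 1L)
    n > 1L && isPrimeUtil 2L

// Hypothetical helper: Array.Parallel.choose runs the test in parallel
// over the array and keeps the numbers for which it returns Some.
let parallelPrimes (upTo: int64) =
    [| 1L .. upTo |]
    |> Array.Parallel.choose (fun n -> if isPrime n then Some n else None)
```

Unlike the list-based version, this materializes the whole range as an array first, so it uses more memory up front.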


I wrote an imperative version, which is faster. It may be impossible to write the Sieve of Eratosthenes in a purely functional way and achieve the same speed, since you must keep a binary state for each number.

let generatePrimes max =
    // p.[i] stays true while i is still a prime candidate
    let p = Array.create (max + 1) true
    // Mark i, i + step, i + 2 * step, ... as composite
    let rec filter i step =
        if i <= max then
            p.[i] <- false
            filter (i + step) step
    seq { 2 .. int (sqrt (float max)) } |> Seq.iter (fun i -> filter (i + i) i)
    seq { 2 .. max } |> Seq.filter (fun i -> p.[i]) |> Seq.toArray