开发者

How can I improve performance on my clojure sieve of eratosthenes algorithm?

I'm learning clojure by going through project euler and am working on problem number 10 (find the sum of all the prime number below two million. I implemented a pretty literal algorithm for the sieve of eratosthenes but it works far too slowly to be useful for up to two million. I tried implement开发者_StackOverflow社区ing it with loop-recur to not create any new frames but that didn't have a big impact on performance.

(defn find-primes-sum [last-prime nums]
    (loop [p last-prime n nums sum 0]
        (println p)
        (if (empty? n)
        sum
            (recur (first n) (doall (remove #(zero? (mod % (first n))) n)) (+ sum (first n))))))

(defn sieve-primes-until [limit]
    (find-primes-sum 2 (filter odd? (range 2 (inc limit)))))

(println (sieve-primes-until 2000000))


(set! *unchecked-math* true)

(defmacro iloop [[b t n] & body]
  `(loop [~@b]
     (when ~t
       ~@body
       (recur ~n))))

(defn count-primes [^long n]
  (let [c (inc n)
        ^booleans prime? (make-array Boolean/TYPE c)]
    (iloop [(i 2) (<= i n) (inc i)]
      (aset prime? i true))
    (iloop [(i 2) (<= (* i i) n) (inc i)]
      (if (aget prime? i)
        (iloop [(j i) (<= (* i j) n) (inc j)]
          (aset prime? (* i j) false))))
    (areduce prime? i r 0
      (if (aget prime? i)
        (inc r)
        r))))

This version targets Clojure 1.3.0 alpha. It counts primes up to 1e8 in 2 seconds on my machine. It can be easily altered to collect them. It was originally written to show that you can implement the sieve so that it runs as fast as the comparable Java.

http://dosync.posterous.com/lispers-know-the-value-of-everything-and-the


Personally, I'd structure the code differently. I've got an implementation of this problem that generates a lazy seq of all primes first, then sums the first 2,000,000 items. Takes 16 seconds on clojure 1.2, but it's hardly optimized except for using recur.

You're also doing a lot of unnecessary comparing with your input range; (remove #(zero? (mod % (first n))) n) tests all candidates against all odd numbers, which is nonsense**, especially when you force it with doall. You actually only have to test candidate prime x against all known primes <= sqrt(x) and discard the candidate when you find your first match.

** I just noticed your algorithm is not that stupid, but I'd still recommend you rewrite your algorithm as a lazy seq finds "the next prime" given a seq of previous primes and a candidate number.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜