Tutorial :Fast Prime Number Generation in Clojure


I've been working on solving Project Euler problems in Clojure to get better, and I've already run into prime number generation a couple of times. My problem is that it is just taking way too long. I was hoping someone could help me find an efficient way to do this in a Clojure-y way.

When I fist did this, I brute-forced it. That was easy to do. But calculating 10001 prime numbers took 2 minutes this way on a Xeon 2.33GHz, too long for the rules, and too long in general. Here was the algorithm:

(defn next-prime-slow      "Find the next prime number, checking against our already existing list"      ([sofar guess]          (if (not-any? #(zero? (mod guess %)) sofar)              guess                         ; Then we have a prime              (recur sofar (+ guess 2)))))  ; Try again                                   (defn find-primes-slow      "Finds prime numbers, slowly"      ([]          (find-primes-slow 10001 [2 3]))   ; How many we need, initial prime seeds      ([needed sofar]          (if (<= needed (count sofar))              sofar                         ; Found enough, we're done              (recur needed (concat sofar [(next-prime-slow sofar (last sofar))])))))  

By replacing next-prime-slow with a newer routine that took some additional rules into account (like the 6n +/- 1 property) I was able to speed things up to about 70 seconds.

Next I tried making a sieve of Eratosthenes in pure Clojure. I don't think I got all the bugs out, but I gave up because it was simply way too slow (even worse than the above, I think).

(defn clean-sieve      "Clean the sieve of what we know isn't prime based"      [seeds-left sieve]      (if (zero? (count seeds-left))          sieve              ; Nothing left to filter the list against          (recur              (rest seeds-left)    ; The numbers we haven't checked against              (filter #(> (mod % (first seeds-left)) 0) sieve)))) ; Filter out multiples    (defn self-clean-sieve  ; This seems to be REALLY slow      "Remove the stuff in the sieve that isn't prime based on it's self"      ([sieve]          (self-clean-sieve (rest sieve) (take 1 sieve)))      ([sieve clean]          (if (zero? (count sieve))              clean              (let [cleaned (filter #(> (mod % (last clean)) 0) sieve)]                  (recur (rest cleaned) (into clean [(first cleaned)]))))))    (defn find-primes      "Finds prime numbers, hopefully faster"      ([]          (find-primes 10001 [2]))      ([needed seeds]          (if (>= (count seeds) needed)              seeds        ; We have enough              (recur       ; Recalculate                  needed                  (into                      seeds    ; Stuff we've already found                      (let [start (last seeds)                              end-range (+ start 150000)]   ; NOTE HERE                          (reverse                                                                              (self-clean-sieve                              (clean-sieve seeds (range (inc start) end-range))))))))))  

This is bad. It also causes stack overflows if the number 150000 is smaller. This despite the fact I'm using recur. That may be my fault.

Next I tried a sieve, using Java methods on a Java ArrayList. That took quite a bit of time, and memory.

My latest attempt is a sieve using a Clojure hash-map, inserting all the numbers in the sieve then dissoc'ing numbers that aren't prime. At the end, it takes the key list, which are the prime numbers it found. It takes about 10-12 seconds to find 10000 prime numbers. I'm not sure it's fully debugged yet. It's recursive too (using recur and loop), since I'm trying to be Lispy.

So with these kind of problems, problem 10 (sum up all primes under 2000000) is killing me. My fastest code came up with the right answer, but it took 105 seconds to do it, and needed quite a bit of memory (I gave it 512 MB just so I wouldn't have to fuss with it). My other algorithms take so long I always ended up stopping them first.

I could use a sieve to calculate that many primes in Java or C quite fast and without using so much memory. I know I must be missing something in my Clojure/Lisp style that's causing the problem.

Is there something I'm doing really wrong? Is Clojure just kinda slow with large sequences? Reading some of the project Euler discussions people have calculated the first 10000 primes in other Lisps in under 100 miliseconds. I realize the JVM may slow things down and Clojure is relatively young, but I wouldn't expect a 100x difference.

Can someone enlighten me on a fast way to calculate prime numbers in Clojure?


Here's another approach that celebrates Clojure's Java interop. This takes 374ms on a 2.4 Ghz Core 2 Duo (running single-threaded). I let the efficient Miller-Rabin implementation in Java's BigInteger#isProbablePrime deal with the primality check.

(def certainty 5)    (defn prime? [n]        (.isProbablePrime (BigInteger/valueOf n) certainty))    (concat [2] (take 10001      (filter prime?         (take-nth 2            (range 1 Integer/MAX_VALUE)))))  

The Miller-Rabin certainty of 5 is probably not very good for numbers much larger than this. That certainty is equal to 96.875% certain it's prime (1 - .5^certainty)


I realize this is a very old question, but I recently ended up looking for the same and the links here weren't what I'm looking for (restricted to functional types as much as possible, lazily generating ~every~ prime I want).

I stumbled upon a nice F# implementation, so all credits are his. I merely ported it to Clojure:

(defn gen-primes "Generates an infinite, lazy sequence of prime numbers"    []    (let [reinsert (fn [table x prime]                     (update-in table [(+ prime x)] conj prime))]      (defn primes-step [table d]                   (if-let [factors (get table d)]                     (recur (reduce #(reinsert %1 d %2) (dissoc table d) factors)                            (inc d))                     (lazy-seq (cons d (primes-step (assoc table (* d d) (list d))                                                   (inc d))))))      (primes-step {} 2)))  

Usage is simply

(->>    (gen-primes)    (filter #(< % 2000000)    ; do what you need with the stuff  )  


See the last example here: http://clojuredocs.org/clojure_core/clojure.core/lazy-seq

;; An example combining lazy sequences with higher order functions  ;; Generate prime numbers using Eratosthenes Sieve  ;; See http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes  ;; Note that the starting set of sieved numbers should be  ;; the set of integers starting with 2 i.e., (iterate inc 2)   (defn sieve [s]    (cons (first s)          (lazy-seq (sieve (filter #(not= 0 (mod % (first s)))                                   (rest s))))))    user=> (take 20 (sieve (iterate inc 2)))  (2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)  


Very late to the party, but I'll throw in an example, using Java BitSets:

(defn sieve [n]    "Returns a BitSet with bits set for each prime up to n"    (let [bs (new java.util.BitSet n)]      (.flip bs 2 n)      (doseq [i (range 4 n 2)] (.clear bs i))      (doseq [p (range 3 (Math/sqrt n))]        (if (.get bs p)          (doseq [q (range (* p p) n (* 2 p))] (.clear bs q))))      bs))  

Running this on a 2014 Macbook Pro (2.3GHz Core i7), I get:

user=> (time (do (sieve 1e6) nil))  "Elapsed time: 64.936 msecs"  


Here's a nice and simple implementation:


... but it is written for some pre-1.0 version of Clojure. See lazy_seqs in Clojure Contrib for one that works with the current version of the language.


(defn sieve    [[p & rst]]    ;; make sure the stack size is sufficiently large!    (lazy-seq (cons p (sieve (remove #(= 0 (mod % p)) rst)))))    (def primes (sieve (iterate inc 2)))  

with a 10M stack size, I get the 1001th prime in ~ 33 seconds on a 2.1Gz macbook.


So I've just started with Clojure, and yeah, this comes up a lot on Project Euler doesn't it? I wrote a pretty fast trial division prime algorithm, but it doesn't really scale too far before each run of divisions becomes prohibitively slow.

So I started again, this time using the sieve method:

(defn clense    "Walks through the sieve and nils out multiples of step"    [primes step i]    (if (<= i (count primes))      (recur         (assoc! primes i nil)        step        (+ i step))      primes))    (defn sieve-step    "Only works if i is >= 3"    [primes i]    (if (< i (count primes))      (recur        (if (nil? (primes i)) primes (clense primes (* 2 i) (* i i)))        (+ 2 i))      primes))    (defn prime-sieve    "Returns a lazy list of all primes smaller than x"    [x]    (drop 2       (filter (complement nil?)      (persistent! (sieve-step         (clense (transient (vec (range x))) 2 4) 3)))))  

Usage and speed:

user=> (time (do (prime-sieve 1E6) nil))  "Elapsed time: 930.881 msecs  

I'm pretty happy with the speed: it's running out of a REPL running on a 2009 MBP. It's mostly fast because I completely eschew idiomatic Clojure and instead loop around like a monkey. It's also 4X faster because I'm using a transient vector to work on the sieve instead of staying completely immutable.

Edit: After a couple of suggestions / bug fixes from Will Ness it now runs a whole lot faster.


Here's a simple sieve in Scheme:


Here's a run for primes up to 10,000:

#;1> (include "primes-up-to.scm")  ; including primes-up-to.scm ...  #;2> ,t (primes-up-to 10000)  0.238s CPU time, 0.062s GC time (major), 180013 mutations, 130/4758 GCs (major/minor)  (2 3 5 7 11 13...  


Based on Will's comment, here is my take on postponed-primes:

(defn postponed-primes-recursive    ([]       (concat (list 2 3 5 7)               (lazy-seq (postponed-primes-recursive                          {}                          3                          9                          (rest (rest (postponed-primes-recursive)))                          9))))    ([D p q ps c]       (letfn [(add-composites                 [D x s]                 (loop [a x]                   (if (contains? D a)                     (recur (+ a s))                     (persistent! (assoc! (transient D) a s)))))]         (loop [D D                p p                q q                ps ps                c c]           (if (not (contains? D c))             (if (< c q)               (cons c (lazy-seq (postponed-primes-recursive D p q ps (+ 2 c))))               (recur (add-composites D                                      (+ c (* 2 p))                                      (* 2 p))                      (first ps)                      (* (first ps) (first ps))                      (rest ps)                      (+ c 2)))             (let [s (get D c)]               (recur (add-composites                       (persistent! (dissoc! (transient D) c))                       (+ c s)                       s)                      p                      q                      ps                      (+ c 2))))))))  

Initial submission for comparison:

Here is my attempt to port this prime number generator from Python to Clojure. The below returns an infinite lazy sequence.

(defn primes    []    (letfn [(prime-help              [foo bar]              (loop [D foo                     q bar]                (if (nil? (get D q))                  (cons q (lazy-seq                           (prime-help                            (persistent! (assoc! (transient D) (* q q) (list q)))                            (inc q))))                  (let [factors-of-q (get D q)                        key-val (interleave                                 (map #(+ % q) factors-of-q)                                 (map #(cons % (get D (+ % q) (list)))                                      factors-of-q))]                    (recur (persistent!                            (dissoc!                             (apply assoc! (transient D) key-val)                             q))                           (inc q))))))]      (prime-help {} 2)))  


user=> (first (primes))  2  user=> (second (primes))  3  user=> (nth (primes) 100)  547  user=> (take 5 (primes))  (2 3 5 7 11)  user=> (time (nth (primes) 10000))  "Elapsed time: 409.052221 msecs"  104743  


Performance comparison, where postponed-primes uses a queue of primes seen so far rather than a recursive call to postponed-primes:

user=> (def counts (list 200000 400000 600000 800000))  #'user/counts  user=> (map #(time (nth (postponed-primes) %)) counts)  ("Elapsed time: 1822.882 msecs"   "Elapsed time: 3985.299 msecs"   "Elapsed time: 6916.98 msecs"   "Elapsed time: 8710.791 msecs"  2750161 5800139 8960467 12195263)  user=> (map #(time (nth (postponed-primes-recursive) %)) counts)  ("Elapsed time: 1776.843 msecs"   "Elapsed time: 3874.125 msecs"   "Elapsed time: 6092.79 msecs"   "Elapsed time: 8453.017 msecs"  2750161 5800139 8960467 12195263)  


From: http://steloflute.tistory.com/entry/Clojure-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%A8-%EC%B5%9C%EC%A0%81%ED%99%94

Using Java array

(defmacro loopwhile [init-symbol init whilep step & body]    `(loop [~init-symbol ~init]       (when ~whilep ~@body (recur (+ ~init-symbol ~step)))))    (defn primesUnderb [limit]    (let [p (boolean-array limit true)]      (loopwhile i 2 (< i (Math/sqrt limit)) 1                 (when (aget p i)                   (loopwhile j (* i 2) (< j limit) i (aset p j false))))      (filter #(aget p %) (range 2 limit))))  

Usage and speed:

user=> (time (def p (primesUnderb 1e6)))  "Elapsed time: 104.065891 msecs"  


I just started using Clojure so I don't know if it's good but here is my solution:

(defn divides? [x i]    (zero? (mod x i)))    (defn factors [x]      (flatten (map #(list % (/ x %)) (filter #(divides? x %) (range 1 (inc (Math/floor (Math/sqrt x))))))))    (defn prime? [x]    (empty? (filter #(and divides? (not= x %) (not= 1 %)) (factors x))))    (def primes     (filter prime? (range 2 java.lang.Integer/MAX_VALUE)))    (defn sum-of-primes-below [n]    (reduce + (take-while #(< % n) primes)))  


After coming to this thread and searching for a faster alternative to those already here, I am surprised nobody linked to the following article by Christophe Grand :

(defn primes3 [max]    (let [enqueue (fn [sieve n factor]                    (let [m (+ n (+ factor factor))]                      (if (sieve m)                        (recur sieve m factor)                        (assoc sieve m factor))))          next-sieve (fn [sieve candidate]                       (if-let [factor (sieve candidate)]                         (-> sieve                           (dissoc candidate)                           (enqueue candidate factor))                         (enqueue sieve candidate candidate)))]      (cons 2 (vals (reduce next-sieve {} (range 3 max 2))))))  

As well as a lazy version :

(defn lazy-primes3 []    (letfn [(enqueue [sieve n step]              (let [m (+ n step)]                (if (sieve m)                  (recur sieve m step)                  (assoc sieve m step))))            (next-sieve [sieve candidate]              (if-let [step (sieve candidate)]                (-> sieve                  (dissoc candidate)                  (enqueue candidate step))                (enqueue sieve candidate (+ candidate candidate))))            (next-primes [sieve candidate]              (if (sieve candidate)                (recur (next-sieve sieve candidate) (+ candidate 2))                (cons candidate                   (lazy-seq (next-primes (next-sieve sieve candidate)                               (+ candidate 2))))))]      (cons 2 (lazy-seq (next-primes {} 3)))))  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »