Can anyone perhaps teach me how to further optimize this 'print up to the nth prime number' script? [closed]
Closed 11 years ago.
I'm a 17 year old getting started with programming with the help of the Python programming language.
I've been seeking to optimize this algorithm, perhaps by eliminating one of the loops, or with a better test to check for prime numbers.
Trying to calculate and display 100000 prime numbers has the script pausing for about 6 seconds as it populates the list with primes before the primes list is returned to the console as output.
I've been experimenting with using
print odd,
to simply print every found prime number, which is faster for smaller inputs like n = 1000, but for n = 1000000 the list itself prints much faster (both in the python shell and in the console).
Perhaps the entire code/algorithm should be revamped, but the script should remain essentially the same: The user types in the number of prime numbers to be printed (n) and the script returns all prime numbers up to the nth prime number.
from time import time
odd = 1
primes = [2]
n = input("Number of prime numbers to print: ")
clock = time()
def isPrime(number):
    global primes
    for i in primes:
        if i*i > number:
            return True
        if number % i == 0:
            return False
while len(primes) < n:
    odd += 2
    if isPrime(odd):
        primes += [odd]
print primes
clock -= time()
print "\n", -clock
raw_input()
I might wanna rewrite the whole script to use a sieve like the Sieve of Atkin: http://en.wikipedia.org/wiki/Sieve_of_Atkin
However, I am simply a beginner at Python (or even at programming: I started writing code only 2 weeks ago) and it would be quite a challenge for me to figure out how to code a Sieve of Atkin algorithm in Python.
I wish a google hacker out there would hand hold me through stuff like this :(
You could use a prime sieve, with a simple twist:
- Define the first prime 2 as you do, and set the largest number reached (max) to 2;
- Generate a list of n consecutive numbers from max+1 to max+n;
- Sieve this list with the primes found so far. When sieving, start each prime at the smallest number in the list that is divisible by that prime;
- If the required number of primes has not been reached, go back to step 2.
This way, you can control the length of the list, and the sieve gets faster as the list grows longer. However, this is a total rework of the algorithm, and it is harder to program.
Here's some sample code, which is quite crude, but it takes less than 70% of the original's running time:
from time import time
primes = [2]
max = 3
n = input("Number of prime numbers to print: ")
r=2
clock = time()
def sieve(r):
    global primes
    global max
    s = set(range(max,max+r))
    for i in primes:
        b=max//i
        if (b*i<max):
            b=b+1
        b=b*i
        while b<=max+r-1:
            if b in s:
                s.remove(b)
            b=b+i
    for i in sorted(s):  # sets are unordered; append the new primes in ascending order
        primes.append(i)
while len(primes) < n:
    r=primes[-1]
    sieve(r)
    max=max+r
primes=primes[0:n]
print primes
clock -= time()
print "\n", -clock
raw_input()
There are many ways to improve this; it just shows the idea behind the approach.
Also, this can blow up the memory when the number is large. I used a dynamic limit to try to relieve this somewhat.
And if you are really curious (and fearless), you could look at the more complicated implementations in various open source projects. One example is Pari/GP, which is written in C, and is blazing fast (I tested 1 to 50000000 in less than 1 min, if I remember correctly). Translating them to Python may be hard, but will be helpful, perhaps not just for yourself ;-)
One simple optimization can be applied without rewriting the code completely.
- Computing i*i for every prime gets very wasteful as the list gets longer. Instead, calculate the square root of number once, outside the loop, and test each prime against this value inside the loop.
However, the square root is itself an expensive calculation, and the majority of candidate numbers will be rejected as divisible by one of the lower primes (3, 5, 7), so this turns out to be not such a good optimization (pessimization?). But we don't actually need to be that precise: a simple check that the prime is less than one third of the value has a similar effect without the computational cost of the square root calculation, at the expense of relatively few unnecessary tests.
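The two bounds discussed above can be compared with a rough sketch. The names `is_prime_sqrt` and `is_prime_third` are made up for illustration. Both assume, like the question's code, that `primes` holds every prime found so far in ascending order and that `number` is odd and at least 3.

```python
import math

def is_prime_sqrt(number, primes):
    # Compute the bound once, outside the loop, instead of i*i per prime.
    # (Float sqrt is exact enough for numbers well below 2**53.)
    limit = int(math.sqrt(number))
    for p in primes:
        if p > limit:
            return True
        if number % p == 0:
            return False
    return True  # ran out of known primes without finding a divisor

def is_prime_third(number, primes):
    # Cheaper bound: an odd composite always has a prime factor
    # no larger than number // 3, so no square root is needed.
    limit = number // 3
    for p in primes:
        if p > limit:
            return True
        if number % p == 0:
            return False
    return True
```

Both return the same answers. The second avoids the square root, but it walks much further through `primes` whenever `number` really is prime, so it is worth timing both before choosing one.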
As Ziyao Wei already said, I'd also try a sieve implementation. The only thing I'd improve is to use the prime number theorem as a starting point for the size of the sieve.
Computing the inverse function isn't straightforward in pure python, but an iterative approach should be good enough and that way you could get a pretty good idea how large the sieve would have to be. Since I don't really remember the proofs for the theorem in detail and it's 6am in the morning here, someone else will have to chip in to say if the theorem guarantees any certain upper boundary that could be used to allow using the simple sieve without having to worry about growing it. iirc that's sadly not the case.
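On the question of a guaranteed upper bound: a standard result in number theory is that the n-th prime is below n(ln n + ln ln n) for n >= 6. Below is a minimal sketch of sizing a sieve with that bound, so it never has to grow. The function names are hypothetical, and the fallback for n < 6 is simply hard-coded.

```python
import math

def nth_prime_upper_bound(n):
    # For n >= 6 the n-th prime is below n * (ln n + ln ln n).
    if n < 6:
        return 12  # the 5th prime is 11, so 12 covers n = 1..5
    return int(n * (math.log(n) + math.log(math.log(n)))) + 1

def first_n_primes(n):
    limit = nth_prime_upper_bound(n)
    # One byte per number: 1 means "still assumed prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    i = 2
    while i * i <= limit:
        if is_prime[i]:
            # Cross out multiples of i, starting at i*i.
            for j in range(i * i, limit + 1, i):
                is_prime[j] = 0
        i += 1
    found = [k for k in range(limit + 1) if is_prime[k]]
    return found[:n]  # the bound overshoots, so trim to exactly n
```

For example, `first_n_primes(10)` sieves up to 32 and returns the ten primes from 2 to 29.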
As already mentioned, the presented algorithm cannot be improved significantly. If a fast solution is required, the sieve of Eratosthenes is appropriate. The size x of the sieve can be estimated from the bound pi(x) > x/(ln x + 2), which holds for x >= 55: any x with x/(ln x + 2) >= n yields at least n primes. The equation x/(ln x + 2) = n can be solved using Newton's iteration. The algorithm presented here is about 10 times faster than the original:
import math
def sieveSize(n):
    # computes x such that pi(x) >= n (assumes x >= 55)
    x = 1.5 * n # start
    y = x - n * math.log(x) - 2 * n
    while abs(y) > 0.1:
        derivative = 1 - n/x
        x = x - y / derivative
        y = x - n * math.log(x) - 2 * n
    return int(x) + 1
def eratosthenes(n):
    # build a list of flags: flags[i] == '1' iff i is prime
    size = sieveSize(n)
    flags = ['1'] * size # start with: all numbers are prime
    flags[0] = flags[1] = '0' # 0 and 1 are not primes
    i = 0
    while i * i < size:
        if flags[i] == '1':
            for j in range(i * i, size, i):
                flags[j] = '0'
        i += 1
    return flags
def primes(n):
    flags = eratosthenes(n)
    prims = []
    for i in range(0, len(flags)):
        if flags[i] == '1':
            prims.append(i)
    return prims[:n]  # the sieve usually finds more than n primes; keep the first n
prims = primes(100000)
Any number that ends in 5, other than 5, is not a prime. So you can put a statement that skips any number ending in 5 that is greater than 5.
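A sketch of that idea, assuming the question's loop over odd candidates. `candidates` is a hypothetical helper, not part of the original script. Since the question's `isPrime` divides by 2, 3 and 5 first, numbers ending in 5 are already rejected by the third division, so the gain is probably small.

```python
def candidates():
    # Yield 3 and 5, then every odd number that does not end in 5.
    yield 3
    yield 5
    n = 7
    while True:
        if n % 10 != 5:
            yield n
        n += 2
```

The main loop would then take its next number from this generator instead of doing `odd += 2`.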