One of the earliest algorithms ever invented. Eratosthenes found a quick way to find all the primes up to a given number n.

Start with the numbers 2,3,...,n. Repeatedly remove all the multiples of the first number in the list, except that number. The first time round, that's 2, so strike 4,6,8... Then the first number is 3, so you strike 6,9,12,... (Note that 6 is already out, from the first stage -- that's OK). Stop when the square of the first number in the list is greater than n. That's it. You can code this pretty efficiently with various tricks of the trade, but the principle is the same.

Obviously, we only delete numbers from the list which are not prime (they're all non-trivial multiples of some number). On the other hand, any composite number up to n is necessarily deleted at some point: such a number may be written xy, and either x2<=n or y2<=n (otherwise xy>n). So the list we end up with is exactly the primes up to n.

There are two equations which are an algebraic expression of this sieve:

k+1 is prime iff for all natural numbers u,v:

(1)<&nbksp> <&nbksp>k <> u*v + u + v

Equation (1) is read "k does not equal u times v plus u plus v". The second equation yields only the odd primes:

2*k+1 is prime iff for all natural numbers u,v:

(2)<&nbksp> <&nbksp>k <> 2*u*v + u + v

One thing to notice. k=2*n satisfies equation (1) for some n iff k=n satisfies equation (2). There are similar equations describing primes in terms of b*k+/-c ("+/-" is to be read "plus or minus") for all naturals b, and each one has an accompanying equation 2*b*k+/-c which describes the primes in the same manner. As b increases, however, more and more work must be done with modulo operations to ensure that the numbers are indeed prime, and that all primes are covered.


The mathematics for the above equations are as follows:

A number n+1 is composite for n in the natural numbers iff there exists a k>1 in the natural numbers such that k divides n and k<n.

A number k divides a number n iff there exists an x in the integers such that k*x = n.

So test n with k which are greater than one. If k is in the set of naturals, then use k+1 for tests. Similarly, use x+1 for x in the naturals.

Thus, a number n+1 is composite iff there exist an x and a k in the naturals such that n+1 = (k+1)*(x+1).

The negation of the above is to say that a number n+1 is prime iff for all x and k in the naturals, n+1 != (k+1)*(x+1), which is the same as saying n+1 != k*x+k+x+1, which is the same as n != k*x+k+x. Similar logic applies for 2*n+1.

Here is a pretty clear explanation of this concept.

Goal: Find all primes which are less than 30.

Step 1: List 2-N

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Step 2: Take 2 as prime, and remove all multiples of 2 up to N.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

Step 3: Take 3 as prime, and remove all multiples of 3 up to N.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 23 24 25 26 27 28 29 30

Step 4: Take 5 as prime, and remove all multiples of 5 up to N.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2122 23 24 25 26 27 28 29 30

Since 5 is the largest integer that is smaller than the square root of N, we can stop here. All number that have not been removed are primes. This leaves us with the following set...

{2, 3, 5, 7, 11, 13, 17, 19, 23, 29}

This algorithm is considered good for small values of N (small in computer terms, so up to 10,000,000 should be ok). You can further enhance the algorithm by skipping all multiples between the current prime and its square. So for instance, in this example when we took 5 as prime, we could skip straight to 25, since we can be sure all other multiples between 5 and 25 (10, 15, 20) would have already been removed. This may not seem like much now, but it can save a bunch of time when your range spans millions.

Here is a simple implementation of the Eratosthenes' Sieve in C. It finds all the primes from 2 to PRIME_MAX - 1 and prints them out.

This implementation takes the smallest known prime that hasn't yet been sieved "currprime" and removes all multiples less than PRIME_MAX from the array. Once the square of currprime is greater than or equal to PRIME_MAX, it stops. As noted above, we can start with the square of currprime, since all lesser multiples will have already been removed.

The program uses an array of chars to hold the primeness of each number from 0 to PRIME_MAX - 1. Each char holds 1 if the index is a prime, or zero otherwise. Since only a bit of storage is needed, storing the primeness in a char is wasteful, but simplifies the code considerably.

#include <stdio.h>

#define PRIME_MAX 100

char sieve[PRIME_MAX];

int nextPrime(int lastprime) {
    int i;

    for (i = lastprime + 1; i < PRIME_MAX; i++)
        if (sieve[i] == 1)
            return i;

    return 0;
}

int main() {
    int i, currprime;

    sieve[0] = 0;
    sieve[1] = 0;
    for (i = 2; i < PRIME_MAX; i++)
        sieve[i] = 1;

    currprime = nextPrime(0);
    while ((currprime * currprime) < PRIME_MAX) {

        for (i = (currprime * currprime); i < PRIME_MAX; i += currprime)
            sieve[i] = 0;

        currprime = nextPrime(currprime);
    }

    for (i = 0; i < PRIME_MAX; i++)
        if (sieve[i] == 1)
            printf("%d\n", i);

    return 0;
}

Since I've been learning Python in my spare time over the summer, and seeing how I'm an amateur mathematician and all, I decided the first thing worth coding was this historic algorithm of the greeks. Well, my first surprise was how small and readable the thing was. It uses some built-in functions from Python, but those are easily explained.

So, without further ado, the code:

def mod(a):
   return lambda x: x % a != 0 or x == a;

def sieve(s, n):
   s = filter(mod(s[n]), s)
   if (s[n+1] > s[-1]**0.5): return s;
   else: return sieve(s, n+1)

def start(x):
   s = range(2, x)
   print sieve(s, 0)

The first definition sets up a nice little functional; that is, a function that begets only further functions. This in particular generates a function that tests if a number is coprime with a, so that mod(2) accepts only two and subsequent odd numbers.

The second definition is the meat of the sieve. It runs by the graces of tail recursion. Each recursion filters out from the sieve all the numbers that are not coprime with the next number on the list. The recursion only runs as long as the factors are less the square root (the **0.5 part) of the largest number still in the list, since possible factors are never greater than the square root of a number. This speeds up the algorithm by a negligible amount — the first step (write down a list of primes) is always linear, so other optimisations are kind of moot.

Finally, the last definition seeds the thing with an arbitrarily sized list. Start the thing by calling, say, start(100) or some larger number, if the stack will handle it.

So elegant. So pretty. This almost makes me want to become a computer scientist. Almost, mind you. I'm not giving up my chosen profession yet.

I'm thinking the next trick will be to write a universal Turing machine. That'd be really swift.

Y'know, if you log in, you can write something here, or contact authors directly on the site. Create a New User if you don't already have an account.