In a previous post, we saw a program that generates prime numbers¹ in what seemed to be a rather convoluted manner. In that post, I asserted that the program was hard to understand partly due to the lack of documentation on the underlying algorithm. In this post, we will explore that algorithm in detail and see how we can better model its core concepts to yield a program that’s easier to grasp with minimal documentation.
But first, let’s explore a simple yet powerful algorithm that represents the foundation of pretty much all algorithms to generate primes.
The Sieve of Eratosthenes
The sieve of Eratosthenes is an ancient algorithm (attributed to the Greek mathematician of the same name) which generates all primes up to a given limit $n$². Roughly speaking, it works as follows:
1. Mark $2$ as the first known prime.
2. Create a list of prime candidates, initialized with all the integers from $2$ up to $n$.
3. Take the next known prime which hasn’t yet been considered and “cross out” all of its multiples from the list of candidates.
4. Mark the next “uncrossed” number in the candidates list as a known prime.
5. Go to step 3 unless we’ve reached or gone beyond $n$.
6. Report all “uncrossed” numbers as primes.
If we had to summarize how the sieve works in a nutshell, this pithy phrase might capture its essence:
> $2$ is prime by definition. Assume all other numbers to be prime, until they are proven not to be prime by virtue of being a multiple of a known prime.
Visually, this is how the algorithm operates:
Here is a very simple (i.e., unoptimized) implementation of this algorithm in Python:
```python
def generate_primes_upto(n):
    is_prime = [True] * (n + 1)
    for candidate in range(2, n + 1):
        if is_prime[candidate]:
            cross_out_multiples_of(candidate, is_prime)
    return collect_primes(is_prime)

def cross_out_multiples_of(prime, is_prime):
    n = len(is_prime)
    multiple = prime + prime
    while multiple < n:
        is_prime[multiple] = False
        multiple += prime

def collect_primes(is_prime):
    primes = []
    for number in range(2, len(is_prime)):
        if is_prime[number]:
            primes.append(number)
    return primes
```
A run of this algorithm with $n = 31$ shows us that everything seems to be working as expected³:

```
$ generate_primes_upto(31)
>> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```
In this program, the first step is implicitly implemented by initializing all elements in the `is_prime` array to `True`. The third step is then carried out by the `cross_out_multiples_of` function.
To derive the time complexity of this algorithm, notice that the outer loop in `generate_primes_upto` traverses each candidate $2 \le p \le n$, but the inner loop is only executed when $p$ is prime, crossing out roughly $n/p$ multiples. The resulting runtime can be expressed as follows:

$$T(n) = \sum_{\substack{p \le n \\ p\ \text{prime}}} \frac{n}{p} = n \sum_{\substack{p \le n \\ p\ \text{prime}}} \frac{1}{p} = O(n \ln \ln n)$$

This last result comes straight from Euler’s proof that the sum of the reciprocals of the primes diverges (growing like $\ln \ln n$), so the overall running time is $O(n \ln \ln n)$, which is surprisingly fast! Unfortunately, this running time is achieved at the expense of using $O(n)$ space, which can get very costly for large enough $n$⁴.
Update: In a previous version of this post, we claimed a better time complexity for the algorithm we’re about to describe, but there was an error in the derivation. The error has since been corrected to show the complexity derived below: $O(n \ln^2 n \, \ln\ln n)$.

Using the core ideas in this algorithm, it is possible to derive another one which only uses $O(\sqrt{n}/\ln n)$ space and generates primes incrementally, allowing us to produce a stream of primes on demand. However, as we’ll see later in this post, that algorithm has $O(n \ln^2 n \, \ln\ln n)$ time complexity, which is significantly slower than the $O(n \ln \ln n)$ of the basic sieve of Eratosthenes.
Before we get to that, let’s see some simple optimizations that we can apply to this algorithm which will also come in handy later on.
Optimizations
There are at least two basic optimizations we can apply to speed up the simplest version of the sieve of Eratosthenes:
- We can skip all even numbers since—apart from $2$—all primes are odd.
- We can stop discarding multiples once we reach $\sqrt{n}$, since every remaining composite number $m \le n$ must have a prime factor $p \le \sqrt{n}$, and will therefore have already been crossed out by such smaller primes in earlier iterations.
Applying these optimizations results in updates to the `generate_primes_upto` and `collect_primes` functions:
```python
import math

def generate_primes_upto(n):
    is_prime = [True] * (n + 1)
    limit = math.floor(math.sqrt(n))
    for candidate in range(3, limit + 1, 2):
        if is_prime[candidate]:
            cross_out_multiples_of(candidate, is_prime)
    return collect_primes(is_prime)

def collect_primes(is_prime):
    primes = [2]
    for number in range(3, len(is_prime), 2):
        if is_prime[number]:
            primes.append(number)
    return primes
```
We can run a quick “smoke” test to ensure we didn’t totally break the algorithm:
```
$ generate_primes_upto(31)
>> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```
Notice that despite these optimizations, the time complexity hasn’t improved dramatically and remains at least linear in $n$—as opposed to proportional to $\sqrt{n}$, as we might naively imagine—due to the final sweep we do in `collect_primes`. Still, we’ve removed a lot of redundant computation. This is not reflected in the asymptotic notation, but it can be significant in the time taken to run on a real computer. In fact, if you run this modified version, you should notice a measurable performance gain of about 50%, which can be an important difference for many applications.
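To check this on your own machine, here is a minimal timing sketch. The name `generate_primes_upto_basic` is our assumption: it stands for the unoptimized version from earlier in the post, kept around under a different name. Exact numbers will vary by machine and Python version:

```python
import timeit

n = 1_000_000
# Compare the basic sieve (hypothetical name for the earlier, unoptimized
# version) against the optimized one defined above.
basic = timeit.timeit(lambda: generate_primes_upto_basic(n), number=10)
optimized = timeit.timeit(lambda: generate_primes_upto(n), number=10)
print(f"basic: {basic:.2f}s, optimized: {optimized:.2f}s")
```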
We can make a few more optimizations by carefully observing how the algorithm works, looking for further redundant computation. Let’s work through an example to see this:
Assume $n = 25$, so that $\sqrt{n} = 5$ ($\lfloor \sqrt{n} \rfloor$ here represents `limit` in the code.) The list of candidates to consider is initially:
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
The first prime is, by definition, $2$. By removing its multiples we get:
[3, _, 5, _, 7, _, 9, __, 11, __, 13, __, 15, __, 17, __, 19, __, 21, __, 23, __, 25]
The next prime (i.e., the next “uncrossed” number) is therefore $3$. If we discard its multiples, we get:
[_, _, 5, _, 7, _, _, __, 11, __, 13, __, __, __, 17, __, 19, __, __, __, 23, __, 25]
The next prime is $5$ (and the last number we’ll test, as we’ve reached $\sqrt{n}$). Notice that there is no point in trying to discard multiples of $5$ like $10$, $15$, or $20$, since they were already discarded in previous rounds by smaller primes (namely, $2$ and $3$), so only $25$ is left to be discarded:
[_, _, _, _, 7, _, _, __, 11, __, 13, __, __, __, 17, __, 19, __, __, __, 23, __, __]
In general, for a prime $p$ we can skip all of its multiples below $p^2$, as we know for sure that those will have been “crossed out” by smaller primes.
Notice also that when discarding multiples, we’re wasting half of the time considering even numbers. For example, assuming $n$ was much bigger and we were to consider multiples of $5$ above $5^2 = 25$, the list of multiples we’d be crossing out includes even numbers interleaved with odd numbers:
[*30*, 35, *40*, 45, *50*, 55, *60*, ...]
This sequence is an instance of the general sequence $p^2 + kp$ (with $k = 1, 2, 3, \ldots$), which includes both even and odd numbers.
To skip the even numbers, observe that:

- $p^2$ is odd, because the product of two odd numbers is odd.
- $p^2 + 2kp$ is odd, because it is the sum of an odd number and an even number.

So the sequence needs to change slightly as follows:

$$p^2,\; p^2 + 2p,\; p^2 + 4p,\; p^2 + 6p,\; \ldots$$
This changes the `cross_out_multiples_of` function as follows:
```python
def cross_out_multiples_of(prime, is_prime):
    n = len(is_prime)
    multiple = prime**2
    while multiple < n:
        is_prime[multiple] = False
        multiple += 2*prime
```
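A quick sanity check (illustrative only) confirms that, for $p = 5$, the function now touches only the odd multiples starting at $25$:

```python
is_prime = [True] * 60
cross_out_multiples_of(5, is_prime)
# Collect the indices that were crossed out.
print([m for m, flag in enumerate(is_prime) if not flag])
# => [25, 35, 45, 55]
```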
Primes Ad Infinitum
What if we wanted to generate primes without bound? Can we use ideas from the sieve of Eratosthenes to produce a stream (i.e., an incremental list) of primes using much less space than $O(n)$?
In the sieve of Eratosthenes, we discard multiples of a newly discovered prime all at once, and then forget (for the purpose of crossing out multiples) about that prime forever. However, in an incremental, lazy algorithm where we do just the minimum amount of work necessary to produce the next prime, we need to keep around a “subset of prime divisors” to discard larger multiples when the next prime is requested.
The good news is that to “pull” prime $p$ we only need to keep around the primes less than or equal to $\sqrt{p}$, since any primes above that would only serve to discard larger integers. And, according to the prime counting function, there are roughly $\sqrt{p}/\ln\sqrt{p}$ primes smaller than $\sqrt{p}$, so the storage requirement is quite reasonable.
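As a back-of-the-envelope example of ours (using the $x/\ln x$ approximation from the prime number theorem): to pull primes near $p = 10^9$, we only need the primes up to $\sqrt{p} \approx 31623$, of which there are roughly

$$\pi\left(\sqrt{p}\right) \approx \frac{31623}{\ln 31623} \approx 3000$$

a tiny fraction of the $10^9$ flags the basic sieve would allocate.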
Before sketching out a possible program for this incremental version, let’s review some of the core ideas we’ll need:
Prime Candidates
We need an infinite sequence of candidate primes (the list of all odd numbers is the simplest choice; future refinements might focus on shortlisting this set of candidates.)
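In Python, such an infinite stream is a one-liner; a sketch using `itertools.count` (which the full implementation further below also relies on):

```python
from itertools import count, islice

odd_candidates = count(3, 2)  # 3, 5, 7, 9, 11, ...
print(list(islice(odd_candidates, 5)))
# => [3, 5, 7, 9, 11]
```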
Implicit Primality Test
To generate the next prime, we need to test whether a candidate $c$ is divisible by smaller primes (as per the considerations above, we only need to check prime divisors up to $\sqrt{c}$). Even though the sieve of Eratosthenes doesn’t talk explicitly about a divisibility test, its crossing out of prime multiples is equivalent to one, and both arise from the fact that “every integer (except for 0 and 1) is either composite (i.e., the product of smaller primes) or a prime itself.”
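To make the equivalence concrete, here is what the test looks like when written as explicit trial division (a sketch; the names are ours, not part of the final implementation):

```python
import math

def is_prime_by_trial_division(c, smaller_primes):
    # c is prime iff no prime p <= sqrt(c) divides it.
    return all(c % p != 0 for p in smaller_primes if p <= math.isqrt(c))

print(is_prime_by_trial_division(29, [2, 3, 5]))  # => True
print(is_prime_by_trial_division(25, [2, 3, 5]))  # => False (5 divides 25)
```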
Prime Divisors
The “subset of prime divisors” mentioned earlier would, in reality, be a set of streams of prime multiples. For example, if the current (conceptual) subset were $\{2, 3, 5\}$, the corresponding data structure would be the set of (infinite) streams:

$$\begin{aligned} 2 &\Rightarrow 4, 6, 8, 10, \ldots \\ 3 &\Rightarrow 6, 9, 12, 15, \ldots \\ 5 &\Rightarrow 10, 15, 20, 25, \ldots \end{aligned}$$
Testing whether $c$ is a prime amounts to verifying that none of the sequences contain $c$. Since we will always be testing increasingly larger numbers $c$, we don’t ever need to “rewind” any of the streams. Also, since they are ordered sequences, the amount of numbers to check is always finite in each (i.e., there is no point in checking further elements once we find $s_i \ge c$, where $s_i$ represents the current element of one of the streams.)
To see how the “subset of prime divisors” grows, we need to remember that for a given number $c$, we only ever need to test prime divisors smaller than or equal to $\sqrt{c}$, because any composite $c$ must have at least one prime factor $p \le \sqrt{c}$. For example, for (say) $c = 100$, all of its divisors greater than $\sqrt{100} = 10$ contain smaller primes in their prime factorization (e.g., $50 = 2 \cdot 5^2$), so testing them would be redundant, as we’d have figured out that $100$ is not prime when testing divisibility by $2$ earlier.
The above implies that we only need to include a new “prime divisor” in the subset when we’ve reached a prime candidate $c$ such that $\sqrt{c}$ is prime (and we can find that out easily since we already know all primes less than $c$ at that point.) Including it earlier would just lead to redundant tests as explained above, but not including it at that point would lead to false positives. For example, consider the initial subset $\{2, 3\}$. When we test $c = 25$ and don’t include $5$ in the subset, we then run the divisibility test and determine that no prime in $\{2, 3\}$ divides $25$, therefore wrongly concluding that it is prime.
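A small sketch of ours illustrating the false-positive scenario, modeling each stream of multiples with `itertools.count` as described above:

```python
from itertools import count

def stream_contains(prime, c):
    # Odd multiples of `prime`, starting at prime**2 (as described above).
    for multiple in count(prime**2, 2 * prime):
        if multiple >= c:
            return multiple == c

def looks_prime(c, divisor_primes):
    return not any(stream_contains(p, c) for p in divisor_primes)

print(looks_prime(25, [2, 3]))     # => True: false positive, 5 is missing!
print(looks_prime(25, [2, 3, 5]))  # => False: 25 is correctly rejected
```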
Putting it all together
With a strong grasp of these ideas, we can finally present an implementation that is split into three parts:

- The overall algorithm that runs through the list of candidates and runs a primality test on each, implemented by `PrimeGenerator`.
- The “subset of prime divisors” at any given moment, implemented by `PrimeDivisors`.
- The stream of multiples for a given prime that supports the implicit divisibility test, implemented by `PrimeMultiples`.
```python
from itertools import count

class PrimeGenerator:
    def generate(self):
        self.primes = [2, 3]
        self.prime_divisors = PrimeDivisors(self.primes)
        yield from self.primes
        while True:
            self.primes.append(self._next_prime())
            yield self.primes[-1]

    def _next_prime(self):
        # step=2 to avoid redundant tests of even numbers
        for candidate in count(self.primes[-1] + 2, step=2):
            self._add_new_prime_divisor_if_needed(candidate)
            if not self.prime_divisors.has_divisor_for(candidate):
                return candidate

    def _add_new_prime_divisor_if_needed(self, candidate):
        # The next prime needing a divisor stream is the first one not yet
        # tracked; add it once we reach its square (adding it any earlier
        # would only cause redundant tests).
        m = len(self.prime_divisors)
        if m < len(self.primes) and candidate == self.primes[m]**2:
            self.prime_divisors.add(self.primes[m])

class PrimeDivisors:
    def __init__(self, first_primes):
        # 2 => 4, 8, 12, 16, ... (never matched, since candidates are odd)
        # 3 => 9, 15, 21, 27, 33, ...
        # ...
        self.multiples_of_nth_prime = [
            PrimeMultiples(prime) for prime in first_primes
        ]

    def add(self, prime):
        self.multiples_of_nth_prime.append(PrimeMultiples(prime))

    def has_divisor_for(self, prime_candidate):
        return any(
            self.is_divisible_by_nth_prime(prime_candidate, n)
            for n in range(len(self.multiples_of_nth_prime))
        )

    def is_divisible_by_nth_prime(self, candidate, n):
        # Advancing the stream until its head reaches `candidate` implicitly
        # checks whether `candidate` is a multiple of the n-th prime,
        # without ever computing a remainder.
        while self.multiples_of_nth_prime[n].head < candidate:
            next(self.multiples_of_nth_prime[n])
        return self.multiples_of_nth_prime[n].head == candidate

    def __len__(self):
        return len(self.multiples_of_nth_prime)

class PrimeMultiples:
    def __init__(self, prime):
        self.prime = prime
        # NOTE(optimization): multiples below `prime^2` are covered
        # by smaller primes
        self.head = self.prime**2

    def __iter__(self):
        return PrimeMultiples(self.prime)

    def __next__(self):
        result = self.head
        self.head += 2*self.prime
        return result
```
To test this implementation, we will need a couple of helper functions:
```python
from itertools import takewhile

def take_upto(limit, iterable):
    return list(takewhile(lambda x: x <= limit, iterable))

def generate_primes_upto_incremental(limit):
    return take_upto(limit, PrimeGenerator().generate())
```
We can then verify that it produces the same result as the original sieve:
```
$ generate_primes_upto_incremental(31)
>> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```
Space Complexity
Since this algorithm produces primes one at a time, the only storage we need is that of the set of prime divisors. If we compute primes up to $n$, we need to keep around primes no larger than $\sqrt{n}$, of which there are approximately $\sqrt{n}/\ln\sqrt{n}$ according to the prime counting function.
Each prime divisor requires a constant amount of storage (because their underlying sequences are only virtual, as you can see in the `PrimeMultiples` class.) Therefore, the overall space required is $O(\sqrt{n}/\ln n)$⁵.
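To get a feel for the difference (our numbers, again using the prime number theorem approximation): for $n = 10^{12}$, the incremental algorithm needs about

$$\pi\left(\sqrt{n}\right) \approx \frac{10^6}{\ln 10^6} \approx 72000$$

streams of multiples, whereas the basic sieve would need $10^{12}$ boolean flags.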
Time Complexity
Analyzing the time complexity is a bit more involved, and a general analysis can quickly get pretty hairy, so we’ll try to focus on the worst case only.
Let’s define $T(n)$ as the running time of the algorithm when we want to generate primes up to $n$. In `PrimeGenerator._next_prime` we can see that we’re traversing the sequence of odd candidates $5, 7, 9, \ldots, n$, and testing the primality of each candidate in turn. If we denote with $t(c)$ the running time of each of these tests, then we have:

$$T(n) = \sum_{c \,\in\, \{5,\, 7,\, 9,\, \ldots,\, n\}} t(c)$$
$t(c)$ will depend on whether $c$ is prime or composite. When it’s composite, the running time will be determined by its prime factorization. For example, if it has small prime factors, this will be detected early on in `PrimeDivisors.has_divisor_for`, short-circuiting the testing by other divisors and resulting in a constant amount of work. The worst case happens when $c$ is prime, which forces a test by all prime divisors stored up to that point, resulting in the following runtime:

$$t(c) = \sum_{p \le \sqrt{c}} t_p(c)$$
The term $t_p(c)$ represents the time it takes to test divisibility of the candidate prime $c$ by each of the “support” primes $p$ stored up to that point (which, by construction, will always be $\le \sqrt{c}$).
Seeing as the computation is done incrementally, the value of $t_p(c)$ is really just the marginal work expressed in the following equation:

$$t_p(c) = w_p(c) - w_p(c')$$
where $c'$ represents the last odd number (`candidate`) for which `PrimeDivisors.is_divisible_by_nth_prime` was invoked, and $w_p(x)$ represents the time it would take to do the divisibility test if the computation wasn’t incremental. Due to the minor optimizations we described earlier, $w_p(x)$ is:

$$w_p(x) = \frac{x - p^2}{2p} = \frac{1}{2}\left(\frac{x}{p} - p\right)$$

The $x/p$ term represents the number of items we’d have to cross out without optimizations; the $-p$ term represents the savings we get when we start crossing out at $p^2$; and the division by $2$ accounts for all the even terms we skip in the streams of prime multiples.
Plugging this back into $t_p(c)$:

$$t_p(c) = \frac{c - p^2}{2p} - \frac{c' - p^2}{2p} = \frac{c - c'}{2p}$$
The quantity $c - c'$ could either be $2$, when $c$ is simply the next odd number after $c'$, or it could be as large as the distance between two consecutive primes (remember that the sequence associated with $p$ may have stayed at the same value after many candidate tests because of the short-circuiting behavior in `PrimeDivisors.has_divisor_for`.)

In the first case, $t_p(c)$ reduces to $1/p$. The second case is rather complicated because it represents a prime gap, for which there is no known absolute formula but only conjectures and certain bounds under various assumptions. If we use Cramér’s conjecture, for example, then the gap is $O(\ln^2 c)$.
Plugging all of this back into the expression for $t(c)$, we get:

$$t(c) \le \sum_{p \le \sqrt{c}} \frac{\ln^2 c}{2p} = \frac{\ln^2 c}{2} \sum_{p \le \sqrt{c}} \frac{1}{p}$$
Once again, we run into the sum of the reciprocals of the primes, which exhibits “ln-ln” growth, so we can reduce $t(c)$ as follows:

$$t(c) = O\left(\ln^2 c \cdot \ln\ln\sqrt{c}\right) = O\left(\ln^2 c \, \ln\ln c\right)$$
We can use this information to derive an upper bound for $T(n)$:

$$T(n) = \sum_{c \,\in\, \{5,\, 7,\, \ldots,\, n\}} t(c) \le \frac{n}{2} \cdot O\left(\ln^2 n \, \ln\ln n\right) = O\left(n \ln^2 n \, \ln\ln n\right)$$

Notice this last bound is simply obtained by assuming that every candidate costs as much as the largest one, $c = n$. Thus, we have found that the time complexity for this algorithm is $O(n \ln^2 n \, \ln\ln n)$. Notice that given how slowly $\ln\ln n$ grows, for most practical purposes the effective complexity of the algorithm is $O(n \ln^2 n)$.
Beware, however, that this analysis relies on an unproven conjecture about the size of the gaps between consecutive primes, and that we have made several worst-case simplifying assumptions in various parts, so the resulting complexity is very likely a somewhat loose upper bound.
Empirical Testing
As with any other algorithm, it is a good idea to run some empirical tests and see whether the results approximately match the predicted runtime.
Keep in mind that big-O notation says nothing about the hidden implementation constants that can make a practical difference between two algorithms. We shouldn’t be too surprised if we find an algorithm with a better asymptotic complexity running more slowly than one with a worse complexity for small enough values of $n$. After all, the definition of big-O notation only requires that $f(n) \le k \cdot g(n)$ for all $n \ge n_0$ (with positive constants $k$ and $n_0$) in order to say $f(n) = O(g(n))$.
Empirically testing for successive orders of magnitude of $n$, we observe that the running times increase by a factor that tracks very closely the expected theoretical runtime ratios between each tenfold increase of $n$:

*(Table omitted: theoretical vs. empirical runtime ratios for each tenfold increase of $n$.)*
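For reference, a minimal harness along the lines one might use to gather such ratios (a sketch of ours; exact timings depend on hardware):

```python
import time

def measure(limit):
    # Wall-clock time to generate all primes up to `limit` incrementally.
    start = time.perf_counter()
    generate_primes_upto_incremental(limit)
    return time.perf_counter() - start

previous = measure(10**4)
for k in range(5, 8):
    elapsed = measure(10**k)
    print(f"n = 10^{k}: {elapsed:.2f}s, ratio = {elapsed / previous:.1f}")
    previous = elapsed
```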
Conclusion
We discovered that the usual space/time trade-off in computer science manifests very prominently in the generation of prime numbers. With enough memory available, even a very naive algorithm can easily outperform more sophisticated methods.
We also found that sometimes the runtime analysis of an algorithm (particularly one involving mathematical objects) can be quite difficult, and without powerful mathematical tools at our disposal, the best we can hope for is some loose upper bounds. Quite often, however, that’s all we need to make a decision, as we’re likely to also run empirical tests anyway.
Even though the resulting algorithm had a much worse time complexity for generating all primes up to $n$, it did have the advantage of requiring significantly less memory, and of being suitable for applications where getting the next prime quickly is more important than the overall speed of generating all primes. Under some strong assumptions related to prime gaps, we concluded that the time to get the next prime in this incremental algorithm was $O(\ln^2 p \, \ln\ln p)$, with $p$ being the last prime generated.
References
- 📝 The sieve of Eratosthenes
- 📝 How many primes are there?
- 📝 Divergence of the sums of the reciprocals of the primes
- 📝 Cramér’s conjecture on prime gaps
Notes
1. A prime number is an integer greater than $1$ which has no divisors other than $1$ and itself. Primes are the building blocks of all other integers, and have many interesting properties.↩
2. Conceptually speaking, the algorithm is capable of producing primes ad infinitum, but a direct implementation is infeasible. In this post we’ll see how to adapt it to accomplish just that.↩
3. In an actual implementation, we would have to add tests to cover corner cases, much larger intervals, as well as tests of the properties of the generated sequences. For the purposes of this post, though, some “smoke” tests will suffice.↩
4. In this post we explore an algorithm that decreases the amount of memory required as an intellectual exercise, but bear in mind that, given the increasingly cheaper costs of RAM, in practice the right trade-off might be to simply throw more memory at the problem, especially since the running time of the basic sieve is surprisingly hard to beat.↩
5. This is not quite true in the implementation shown here, as we do keep all primes generated in the instance variable `PrimeGenerator.primes`. This amounts to $O(n/\ln n)$ space, which is slightly better than $O(n)$ but much worse than $O(\sqrt{n}/\ln n)$. If we don’t know $n$ in advance (or if we truly need an “infinite” stream of primes), this is unavoidable. However, if we do know $n$ in advance, we can tweak the implementation to ensure that no more than $\pi(\sqrt{n}) \approx \sqrt{n}/\ln\sqrt{n}$ items are ever stored in `PrimeGenerator.primes`, as that’s all we need to guarantee correct generation of all other primes up to $n$.↩