## Collatz sequence generation performance profiling in Clojure (Project Euler Problem 14).

I investigate the performance of several Collatz sequence generation algorithms through profiling Clojure code, inspired by Project Euler Problem 14 (no spoilers as I won't show the solution number in this article, however, if you run the code you will get the solution as the return value). My approach focuses on the number of specific operations of interest rather than clock time (but clock figures are also discussed).

### Motivation

John Conway in 1972 proved that Collatz-type problems can be formally undecidable. Better yet, Paul ErdÅ‘s commented that “mathematics is not yet ready for such problems.” See the Wolfram MathWorld entry for the Collatz Problem for more historical context and extensive bibliography.

### Problem

The Collatz conjecture, proposed by Lothar Collatz, states that all Hailstone sequences as generated by the following piece-wise function eventually reach the number `1`:

So for example, starting with `6`, we obtain the following sequence:

`6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1`

The 14th problem on Project Euler challenges us to find the starting number, under one million, which results in the longest Collatz sequence.

### Solutions and Performance Analysis

Since the Clojure contrib package for profiling was removed, we will use the profiling facilities provided by the wonderful timbre library by Peter Taoussanis; hence all code examples are surrounded as such:

For clock time, we use versions of the code without any profiling macros and rely on the invaluable Criterium library by Hugo Duncan; hence all code examples are surrounded as such:

All benchmarks shown here were done on my machine, an iMac with 2.93 GHz Intel Core i7 and 1333 MHz DDR3 RAM, running x86_64 Mac OS X 10.9.3 with Oracle's JVM 1.8.0_05-b13 and HotSpot 64-Bit Server VM 25.5-b02 mixed mode.

#### A. Brute Force

The most staightforward strategy for solving this problem is to generate the Collatz sequence for each integer starting from `1` up to `limit`, and then find the longest one.

Of course, this strategy is pretty hopeless, which is not a surprise for a problem from Project Euler. The running time is in the order of O(n * C(n)) as we calculate one complete sequence for each integer, and the length of each sequence, C(n), is dependent on its first integer (this function “goes up and down”, hence the name hailstone). The space requirement is in O(n) as we only keep each integer and corresponding sequence length once.

Since the Collatz conjecture is an open problem, and we don't know whether the successor function is defined for every n, we cannot provide an asymptotic time complexity measure. The best we can do for now is to claim that the complexity of C(m) is proportional to Θ(m) for known sequences starting with m in the range `1` to `limit`.

From the profile names, we see that our algorithm has predictably generated one series less than the limit (since we started incrementing from `2`, but has calculated over 131 million hailstone successors! Clearly there is a lot of repetition. On my fairly modern i7 iMac this takes well over five minutes to complete and gobbles around 1 GB of memory, causing over one thousand garbage collector sweeps.

Profiled NameInstance Count
:user/collatz-seq999,998
:user/next-collatz131,434,272

#### B. Simple Memoization

The problem becomes more easily solvable once we notice that every Collatz sequence will eventually contain, as a suffix, a shorter Collatz sequence, and assuming the convergence hypothesis to be true, all such suffix sequences will converge to the integer `1`. Therefore, we will employ a memoized algorithm which generates sequences starting from an integer `N` but only up to an integer `N'` for which a sequence has already been found, and proceed inductively (the base case is the sequence of length `1` which contains `1` itself); the length of the sequence corresponding to the integer `N` is then the length of the sequence up to the integer `N'` plus the length of the stored sequence for `N'`.

We have now constrained the number of sequence generations to the order of O(n) because we will never generate a hailstone more than once, although given that Clojure's lookup time in our memoization map is O(log n), our algorithm is technically still in O(n log n); however, the actual cost for Clojure's lookup operations is so small (O(log32 n)) that this algorithm significantly outperforms the previous brute force one, as expected, and thus we may consider it near O(n). The space requirements have grown, however, since we keep successors in memory, but only slightly as it is within O(n * C(n)).

Once again, the algorithm generated one series less than the limit, as before, but now it has calculated just over two million hailstones. On the same machine, this takes roughly one and half minutes, and 1 GB of memory with only 27 garbage collector sweeps since the memoization does not lose references to its contents.

Profiled NameInstance Count
:user/collatz999,998
:user/collatz-seq999,998
:user/next-collatz2,168,610
Benchmark MetricValue
Execution time sample mean1.087031 min
Execution time mean1.087147 min
Execution time sample std-deviation881.593049 ms
Execution time std-deviation887.078664 ms
Execution time lower quantile1.072142 min ( 2.5%)
Execution time upper quantile1.129129 min (97.5%)

#### C. Sequences-Only Memoization

We can make a trade between calculating successor hailstones and memoizing all of them, by memoizing only entire sequences. Once a sequence reaches a number for which a sequence has previously been generated, its length is calculated by adding the memoized length to the current sequence length.

This time we have calculated over five million hailstone successors (again O(n * C(n)), but kept our space requirements to the order of C(n). Unfortunately, we have now introduced 2 * (C(n) + n) look ups in the memoization (:user/existing-lookup-condition and :user/existing-lookup), thanks to Clojure's implementation of `split-with` which uses `take-while` and `drop-while` resulting in the predicate being called more than once on many hailstones. Still, this algorithm takes roughly 4.1 seconds to complete, and 0.5 GB memory with 139 garbage collector sweeps.

Profiled NameInstance Count
:user/collatz999,998
:user/next-collatz5,226,259
:user/existing-lookup-condition12,452,514
:user/existing-lookup999,998
:user/existing-assoc999,998
Benchmark MetricValue
Execution time sample mean3.989335 sec
Execution time mean3.989318 sec
Execution time sample std-deviation21.328040 ms
Execution time std-deviation22.139850 ms
Execution time lower quantile3.950992 sec ( 2.5%)
Execution time upper quantile4.030324 sec (97.5%)

#### D. Sequences-Only Memoization Without Extra Lookups

We can make yet another trade and avoid the extra lookups by calculating a few more successors. In order to do this, we need to avoid `split-with` from the previous algorithm and replace it with `take-while`; unfortunately, `take-while` does not return the last hailstone as it's the one satisfying the predicate `split`, so we will have to call `next-collatz` again for the last hailstone and thus duplicate a small amount of our work, specifically n additional successor calculations.

We got rid of all but n of the extra lookups, and got a small speed up: this algorithm runs in roughly 3.7 seconds, just 400 milliseconds faster than the previous one (and uses the same memory). Can we get rid of the duplication as well?

Profiled NameInstance Count
:user/collatz999,998
:user/next-collatz6,226,257
:user/existing-lookup-condition6,226,257
:user/existing-lookup999,998
:user/existing-assoc999,998
Benchmark MetricValue
Execution time sample mean3.630457 sec
Execution time mean3.630670 sec
Execution time sample std-deviation20.309249 ms
Execution time std-deviation21.256671 ms
Execution time lower quantile3.606157 sec ( 2.5%)
Execution time upper quantile3.677730 sec (97.5%)

#### E. Sequences-Only Memoization Without Extra Lookups (Another Attempt)

We can refactor the previous solution to avoid the `n` lookups (named :user/existing-lookups) and the duplicated calls to `next-collatz`. In order to do so, we note that Clojure's `take-while` will not yield the value for which the predicate is `false`, which we need since it's the first memoized hailstone. We also don't want to use `split-with` since the lookups in the predicate will be repeated many times, defeating our purpose. Furthermore we can't return more than one value from our successor `collatz-next`.

We can, however, modify this successor to do two things: first, it will now be responsible for making the lookups and returning the memoized length needed by `collatz`, appended to the sequence after the last new hailstone; second, it will employ a sentinel value in the form of a different data type for the last, special sequence entry, just before returning `nil` and terminating `collatz`'s iteration.

We got rid of the extra lookups and the duplication, but this algorithm runs slower! I takes roughly five and a half seconds. Evidently, Clojure's overhead is too large for these values of n.

Profiled NameInstance Count
:user/collatz999,998
:user/next-collatz5,226,259
:user/existing-lookup-condition6,226,257
:user/existing-assoc999,998
Benchmark MetricValue
Execution time sample mean4.821011 sec
Execution time mean4.820981 sec
Execution time sample std-deviation27.390670 ms
Execution time std-deviation27.611627 ms
Execution time lower quantile4.761616 sec ( 2.5%)
Execution time upper quantile4.878438 sec (97.5%)

#### F. Iterative Sequences-Only Memoization

Recall that the length of a sequence is actually what we're looking for: there is no need for our algorithm to calculate the sequence for any value of `n` which appears as a hailstone in a memoized sequence, since the distance of that hailstone from the end of the sequence can be trivially measured. Therefore, the following algorithm 'processes' each sequence and keeps a mapping between each hailstone and its distance from the end. The larger the new sequence, the more lengths for values of `n` we get (in near constant time as we `recur` on the sequence).

We only needed to process roughly four hundred thousand sequences, and got away with skipping over half a million sequences as they involved a value of `n` for which we had already calculated its length, mostly from the distance of its value from the end of a memoized sequence. We're back down to roughly four seconds, but only 0.3 GB of memory required this time.

Profiled NameInstance Count
:user/collatz999,998
:user/next-collatz2,168,610
:user/existing-lookup-last432,363
:user/existing-lookup-condition5,769,581
:user/existing-lookup-sequence2,168,610
:user/existing-assoc2,168,610
:user/existing-skip1,355,037
:user/existing-seen567,635
:user/process-collatz-sequence432,363
Benchmark MetricValue
Execution time sample mean4.002129 sec
Execution time mean4.001909 sec
Execution time sample std-deviation98.338164 ms
Execution time std-deviation99.362992 ms
Execution time lower quantile3.897592 sec ( 2.5%)
Execution time upper quantile4.229077 sec (97.5%)

#### G. Iterative Sequences-Only Memoization with Heuristic Constraint

Do we need to memoize the lengths of all hailstones we encounter? Perhaps we can experiment with a very simple heuristic: only memoize lengths for hailstones less than `limit`; if we are lucky, for the given values of `n` we should more or less stay within the range of `1` to `limit` most of the time, and only miss out on a few sequences which contain hailstones beyond our limit (thus forcing us to calculate segments of sequences more than once instead of finding them in our memoization).

As expected, most of the profiled counts are the same with those of the previous algorithm; after all, we are computing the exact same sequences and memoizing the same length values for each `n`. But crucially, we reduced the memoized values by a half (:user/existing-assoc), at the tiny cost of calculating roughly 11.6% more successors (:user/next-collatz) and corresponding lookups (:user/existing-lookup-condition). This algorithm now runs in just over three seconds and uses only 0.3 GB of memory.

Profiled NameInstance Count
:user/collatz999,998
:user/next-collatz2,355,035
:user/existing-lookup-last432,363
:user/existing-lookup-condition6,142,431
:user/existing-lookup-sequence999,998
:user/existing-assoc999,998
:user/existing-skip1,355,037
:user/existing-seen567,635
:user/process-collatz-sequence432,363
Benchmark MetricValue
Execution time sample mean3.187209 sec
Execution time mean3.187221 sec
Execution time sample std-deviation16.322258 ms
Execution time std-deviation16.469264 ms
Execution time lower quantile3.161468 sec ( 2.5%)
Execution time upper quantile3.218255 sec (97.5%)

### Further Exploration

A discussion of various other attempts at this problem in Clojure, including a native Java integer array, can be found at the Clojure Companion Cube by Ivar Thorson. Although none of those solutions perform better than the ones above, Ivar shares several important lessons learned about performance in Clojure in particular and the JVM in general.

As with all problems, this one is being discussed in the Project Euler forum.

### Conclusion

As Alan Perlis once suggested, “A LISP programmer knows the value of everything, but the cost of nothing.”

Or, as Randall Munroe from xkcd 710 puts it, slightly more insightfully:

### Acknowledgments

Gratitude to Dr Anastasia Niarchou for her feedback.