The Ulam sequence is a sequence of positive integers xn, where x1=1, x2=2, and where each xn for n > 2 is defined to be the smallest integer that can be expressed as the sum of two distinct earlier terms in a unique way. The first few terms of the sequence are 1, 2, 3, 4, 6, 8, 11, 13, 16, 18, 26, 28, 36, 38, 47. The third term is 3, because 3=1+2. The fourth term is 4, because although 4 can be expressed in two ways as the sum of two earlier terms (1+3, or 2+2) the sum “2+2” is not a sum of distinct earlier terms. The fifth term is not 5, because 5=1+4=2+3; rather, it is 6, which is 2+4. The sixth term is not 7, because 7=1+6=3+4, but rather 8, which is 2+6. The last two terms shown add up to 38+47=85, which can be expressed as a sum of two earlier terms in a unique way, and this proves that the next term of the sequence exists and is at most 85. It follows that the sequence continues indefinitely.
Bizarrely, it turns out that the Ulam sequence contains a hidden signal that can be revealed using Fourier analysis. In 2015, Stefan Steinerberger pointed out in the paper A hidden signal in the Ulam sequence that there appears to exist a constant α (approximately equal to 2.5714475) with the properties that the values of αxn modulo 2π are very far from random, as shown above. In particular, these values almost all lie between π/2 and 3π/2. Another way to say this is that if x is one of the first 10 million Ulam numbers, then cos(αx) is negative, with the exception of four values (2, 3, 47, and 69). Typically, the presence of a signal like this is a clue that the sequence exhibits some kind of periodic behaviour, which is puzzling because the Ulam sequence does not appear to do this.
Unlike (say) the Fibonacci numbers, the Ulam numbers seem to have the property that they do not become more and more spread out as they get larger. More precisely, it seems to be true empirically that the probability that a large number chosen at random is an Ulam number is about 7.39%.
The set of Ulam words was introduced in 2020 in the paper Ulam Sets in New Settings, by Tej Bade, Kelly Cui, Antoine Labelle, and Deyuan Li (working under the supervision of Noah Kravitz). Rather than being a number, each element of the set is a binary string of zeros and ones, starting with the strings “0” and “1”. The role of adding up earlier terms in the original Ulam sequence is now played by concatenation of earlier strings, shown above in red and blue. For example, the string 0111 can uniquely be expressed as the concatenation of the shorter strings 011 and 1.
A string of 0s and 1s is included as an Ulam word if it can be expressed as the concatenation of two distinct shorter Ulam words in a unique way. For example, the string “01” is included in the set, because it can be formed by concatenating “0” with “1”. The string “10” is also included, for similar reasons. Unlike the original sequence, the strings “01” and “10” can be included in either order, which is why we refer to the Ulam words as a set, not a sequence. The string 0101 is not included in the set, as it is not the concatenation of two distinct shorter strings, and the string 0011 is not included because it is the concatenation of two shorter strings in two different ways, either by joining 001 and 1, or by joining 0 and 011.
From the list of Ulam words, we see that 4 of the 8 possible binary strings of length 3 are Ulam words, and 8 of the 16 possible binary strings of length 4 are Ulam words, which amounts to 50% in each case. If n is larger, the proportion of binary strings of length n that are Ulam words drops to around 20%. The paper by Bade et al. conjectures that this proportion converges to a fixed nonzero percentage as n tends to infinity, which (if true) would be analogous to the apparent behaviour of the original Ulam numbers. However, the recent paper Distributions of Ulam Words up to Length 30 by Paul Adutwum, Hopper Clark, Ro Emerson, Alexandra Sheydvasser, Arseniy Sheydvasser, and Axelle Tougouma makes a different conjecture, namely that the proportion of Ulam words converges to zero proportionally to n to the power -3/10, as shown in the graph above.
The paper by Adutwum et al. also exhibits a precise connection between the set of Ulam words and the discrete Sierpiński triangle. To construct the triangle, we start with Pascal’s triangle, shown in the left of the picture above, and draw a box around each odd number. (Recall that, apart from the flanking rows of 1s, each entry in Pascal’s triangle is the sum of the two entries above it in the previous row.) The rest of the picture shows the next two steps in the procedure: first left-justify the entries in the triangle, and finally, rotate the whole picture anticlockwise by 90 degrees. The discrete Sierpiński triangle is the resulting pattern of boxes, as shown below.
The connection between Pascal’s triangle and the Sierpiński triangle fractal (shown below) was already known, but Adutwum et al. show how to relate these ideas to the set of Ulam words. More precisely, the authors prove that if we plot the set of all points (x, y) with the property that the binary string 111…1000…0 with y ones and (x – y) zeros is an Ulam word, then the result is the discrete Sierpiński triangle with one extra point, (1, 1). For example, the Ulam words 1, 10, 100, 110, 1000, and 1110 correspond to the coordinates (1,1), (2,1), (3,1), (3,2), (4,1), and (4,3). Ignoring the point (1,1), this gives the first three columns of the discrete Sierpiński triangle.
Picture credits and relevant links
The Ulam sequence was discovered by Stanisław Ulam (1909–1984). It appears as sequence A002858 in the On-Line Encyclopedia of Integer Sequences.
It would be easy to find a whole book’s worth of material about the remarkable properties of the Ulam sequence and its generalizations, and this post barely scratches the surface. The three papers mentioned above are a good starting place for further explorations.
The picture of the twin peak distribution comes from the paper by Stefan Steinerberger.
The scatterplot graph, the diagrams of Pascal’s triangle, and the picture of the discrete Sierpiński triangle come from the paper by Paul Adutwum, Hopper Clark, Ro Emerson, Alexandra Sheydvasser, Arseniy Sheydvasser, and Axelle Tougouma.
The picture of the Sierpiński triangle fractal is by Beojan Stanislaus and appears on Wikipedia.
The other graphics are my own work.
Substack management by Buzz & Hum.
Dr Green,
I’ve never before been exposed to this form of mathematics and the results intriguing. Reading along, there’s that pesky Pascal’s triangle yet again. Thanks for the Ulam Overview!
John
I bet this kills at cocktail parties… what a stroke of luck to have a gambling uncle to learn cool tricks from…