A number sequence puzzle is given as a finite sequence of numbers; the challenge is to find the next number in the most obvious extension of the given sequence.

For example, given the puzzle 2, 5, 10, 17, we can see it as the start of the following sequence of numbers: 2, 5, 10, 17, 17, 17, 17, 17, 17, .... But that is "not right": there is no good reason to follow up with more numbers 17. So it's not a good answer. But we can find a simple function that generates every number in the sequence: f(n) = n*n + 1. According to this function, the next number is 26. And if this is the simplest function to describe the given sequence, 26 is the best answer.

If only polynomial functions are used, a completely general method exists to solve these puzzles: Lagrange interpolation. Given any finite sequence of numbers, it always produces the least complex polynomial that fits the sequence; this polynomial is unique and it determines a unique next number. It is not always an integer, but this can be fixed by using a different interpolation formula.

In general, a solution does not have to be a polynomial; it may be some other computable function. For example, you probably want to follow up 0, 1, 0, 1, 0, 1, 0, 1 with 0, but that is not what Lagrange interpolation will produce. So the aim is more generally to find the 'simplest' function that generates another number. To this end, we can define 'simplest' as 'having the lowest Kolmogorov complexity'. The Kolmogorov complexity of a function is the amount of code required to program it; with a Turing machine, to be exact. This doesn't guarantee a unique result, but that's not the point - the point is to give a reasonable measure against which proposed solutions can be compared.

Why use a Turing machine to define complexity, you may ask. The answer is

  • we have to take some machine model, and the Turing machine is pretty standard
  • the theory shows that the choice of machine model doesn't really make much of a difference on the whole, although it matters when comparing individual functions
Still, it is meaningful to ask if the machine model can be 'fine-tuned' to be a closer approximation to the human intuition regarding these puzzles. A machine with random memory access seems more reasonable, for instance.

PS Suvrat makes the point that many sequence puzzles draw on real world knowledge. For example, 3, 3, 5, 4, 4, 3 can be followed by 5 because the sequence represents the string lengths of the English numerals. The approach above implicitly assumes that no such knowledge is required.