Sudoku puzzles can trace their lineage from the magic square (carrés magiques) of 18th century mathematical genius Euler, via latin squares, through American and then Japanese versions in the 80s (from which the name arises- roughly translated, it means solitary number), to their current surge in popularity in the UK, fuelled by publication in most of the quality newspapers (having first appeared in late 2004, however, in the Daily Mail).

For a month or so now, a few of my fellow students have been whiling away time in our own mathematical square, plugging away at Suduko grids from brightly coloured Japanese puzzle collections (combined circulation of dedicated Sudoku books in Japan is around 600,000 a month). Since I like to fuel the illusion that I know a thing or two about numbers, but am in fact generally rubbish at puzzles, I avoided attempting any until earlier this week when I spotted a bunch in The Independent- the next thing I knew, I'd lost my Wednesday afternoon.

Of course, I mangled the first one horribly, so sat down to approach the second one in a more systematic fashion. The simplest trick, as described by Sir Norris above, is to find cells which, through a combination of pressures from the row, column and 3*3 block, can only take a single value. After getting bored with pencilling in possibilities, I knocked together a Maple worksheet that generated lists of possible values for each cell. The best case of this brute force approach is when singletons appear- by plugging those back in and repeating the process, I solved the elementary problem from the Indie entirely mechanically. Which, although a triumph for my feeble coding skills, wasn't particularly enlightening with regards to the subtleties of Sudoku. So I fed the most difficult problem in, and was rewarded with not one cell that was completely determined. Now I had a challenge :)

Some advanced Sudoku tactics

Possibly ruining all the fun, and listed in order of desperation- the further you get in this list, the more fiendish the original puzzle.

So, having carried out the entirely routine steps of identifying potential values (either through fancy computing or writing extremely small in pencil), and assuming that wasn't sufficient to solve the grid, we're in a position to apply some logic. The hope here is that through reasoning, we can reduce or eliminate entirely the need for guesswork and backtracking. Of course, a lucky guess will speed things up, so a combination of reason and intuition make for a true sudoku master. These methodical steps seem like an ideal starting point for a full-fledged solver program, should any of you coders fancy a distraction.

Effective singletons

If no singletons have been identified, then the best case is a collection of cells with only two options. If in any given row, column or block there are two such cells with identical options, then their fates are inextricably linked- determine one, and you know the other. Often, this can set up a chain reaction across the board as a whole. But they can also force the value of another cell, as the following example shows.


-------------------------
|(4,5) :    3    :       |
|  2   :         :       |
|      : (4,5,9) : (4,5) |
-------------------------

Here the values 2 and 3 are given, and we've worked out the options for three cells of interest. If we figure out which of (4,5) the top left cell is, then we get the bottom right for free. But we immediately know the value of the (4,5,9) cell, which may help us determine entries elsewhere if we're currently stumped by the (4,5) choice. This is because whichever the top left cell turns out to be, 4 or 5, the bottom right cell will be the other, 5 or 4. So really, no other cell in this block can be a 4 or a 5, so the (4,5,9) entry is in fact the singleton (9).

This works just as well in a row or column, but the block was a bit easier to display in ascii art. Furthermore, larger cases are theoretically possible- cells with the options (3,4,5), (3,4,5), (3,4,5), (3,4,5,6) force the 6 for the last cell; or (1,2), (1,2), (3,4), (3,4), (1,2,3,4,5) forces the 5 in a cell that seemed to have unassailably many options. But such intricate cases require more cells to be just-so, and hence are rarer.

The importance of once-and-only-once

So far, the approach of identifying singletons or effective singletons has relied on the fact that a given number may appear only once in the row/column/block- we generate lists of potential values by eliminating the ones that have been 'used up' by already making an appearance. But we have a two way condition- each value needs to appear precisely once. So instead of looking for cells that can only take a single value, we look for values that can only occupy a single cell. Here's another block to illustrate.


---------------------------
|(4,6) :    3    : (1,7,9) |
|  2   :  (1,4)  : (7,8,9) |
|(7,9) : (4,5,9) :  (4,5)  |
---------------------------

In this complicated looking list of options, we can in fact determine two of the values. The top left cell has to take the value 6, since no other can; and similarly the 8 must lie in the middle of the third column.

Proof by contradiction

Suppose all the singletons,both immediate and effective have been identified, and furthermore any forced choices from the second strategy above have been made. You've scanned all the affected rows, columns and blocks in light of these changes, and have no more such cases to consider. So a guess is needed. What makes a good gambit? Chances are, what you do have are several chains of cells who's fates are linked, as described earlier. A (4,5) choice may be linked to another such choice in a column- but also be connected to a (3,5) in its row. Now, if it turns out to be a 4, then we don't get any information about the (3,5) pair- but if it's a 5, we determine that (3,5) is really a 3. So, you may have some luck in considering one of the cases from a particularly popular pair. Chose one option, and chase it around the grid. If it generates a contradiction somewhere (two 3's in the same block, for instance), then it can't be a valid choice- so you know it's the other one from the original pair. But beware- absence of contradiction is not sound grounds to assume that your choice was correct, unless of course you finish the puzzle in the process of following its implications. It could just be that you haven't reduced enough cells to get the contradiction, rather than there not being a contradiction at all; and then you'll come unstuck later. The validity of a choice is only certain once you've found counterexamples to the validity of all the other choices- this, in simpler fashion, is what we've been doing from the start in dismissing possibilities.

Sudoku Variants

Sudoku seem to be spreading across the puzzle pages of Britain's quality compacts and broadsheets, and inevitably different papers will introduce gimmicks, such as the Daily Mail's Sudoku-X (which uses rectangles and places a condition on diagonals). The Independent seems particularly keen, having tried letters in place of numbers (with one of the rows or columns yielding an actual word), and on Saturdays offering a Super Sudoku version, which ups the ante to 16X16, 4 subgrids (in hex), clearly making it even harder to keep track of a comprehensive list of possible values, either in your head or on paper! They even ran a national tournament (I qualified for regionals but was moving that day. Oh well.) However, the tried and tested 3X3 version will probably emerge as the staple, since the amount of time it consumes can be tweaked to more or less manageable cases, whilst the super is always going to while away a weekend. Sticking with the format but changing the character set, the debutante has pointed out that Sidduroku, substituting hebrew letters for numerals, will 'fuzzle the brain'!

Complexity

Research at the university of Tokyo has shown that solving n-by-n belongs to the NP-complete class, which is (roughly) one where a solution is easily checked for validity but becomes exponentially more difficult to compute as n grows. There are, for instance, 5524751496156892842531225600 different 9-by-9 latin squares possible- but the additional sudoku requirement on the internal 3-by-3 boxes carves that number down considerably. Reminiscent of the eternity puzzle, specifying the position of numbers in the grid can actually make things harder for the solver since it reduces the number of possible solutions. This is partly how it is possible to adjust the difficulty of a sudoku grid.

themanwho has put a solver online which implements some of the above strategies, removing the donkey work and leaving the interesting stuff (if any; it cheerfully ripped through a maximum difficulty problem from the Independant I tried with it.) It also makes my maple version look shambolic. You can find it here.