Our Sudoku puzzle generator

Generating Sudoku puzzles is a moderately tricky process, something that computers make much easier of course. The process consists of two steps, generating the base grid and then erasing the grid.

For a regular Sudoku the simple way to generate a puzzle is to “seed” the puzzle, filling in the upper left corner with a single random digit, 1-9. Repeat this process until you have the digits 1-9 in some order across the top row.

What you do next depends on the type of puzzle and how much of a programmer you are. A standard method used for solving Sudokus and other logic puzzles is the “Dancing Links” algorithm. This is a fairly efficient method of solving Sudokus, but doesn’t work “out of the box” for all puzzles. For example, Stars & Stripes Sudoku doesn’t work directly because Dancing Links normally wants to eliminate all conflicts – that is, if you assume there’s a “*” here in a row it would remove all other “*” possibilities. Likewise, the fault line in an Earthquake puzzle doesn’t work at all with any of the standard solving techniques.

The simplest method to continue generating the initial grid is the brute force approach. In this method, after filling in the top row you would start filling in the second row by taking a random digit and trying to place it in the first space of the second row. If it’s the same as any of the three numbers already in the box, try again with a different digit. Repeat this process, moving right and then down until you’ve filled in the whole puzzle. Each time you fill in a space if you realize a digit can’t go there, erase it from the list of digits and try again. If at any point there are no digits left to fill in, go back to the previous space and try a different digit.

This is an incredibly slow technique.

An improved version of this keeps track of the possible digits that can go into a cell. Each cell also “knows” who its neighbors are, the cells that are in the same row, column and box. Each time you try to put a digit in a cell, you go to each neighboring cell and “cross off” the digit from that cell. If doing so would leave a cell where no digits can go then you go back to the original cell (undoing all the crossing off you’ve done) and mark that the digit cannot go there. If at any point while erasing, a cell is left with only one digit that can go there, that’s the same thing as assuming the digit is correct and so you need to visit its neighbors too, crossing things off and checking. If you manage to visit all the neighbors without causing a contradiction, then the digit you have filled in is a possibility, however you must keep track of what’s been crossed off so you can restore it – always in reverse order, by the way – if down the line a contradiction is found.

An mildly improved version of this also keeps track of how many possibilities are left for each space and favors those with fewest possibilities: that is, if one cell has four different digits that could be filled in there and another cell has six, you try the one with four possibilities first.

In many ways this technique is a variation of the dancing links algorithm, just without the links and the dancing. Both techniques work by making an assumption and following it through until it eithers fails immediately or doesn’t lead to a contradiction, at least not immediately, making this a “backtracking” algorithm.

Once a filled in puzzle has been created, the next step is to erase digits from it. The trick here is to ensure that the puzzle remains solvable with a single solution. That is, whatever technique is used to solve the puzzle must ensure that even after one solution is found, you keep hunting for additional solutions.

There are two basic ways to go from here, erasing and revealing. To a certain extent this is based on whether you’re trying to make an easy puzzle or a hard one. To make an easy one, the best way is to erase. That is, take the puzzle and erase until it has the number of givens you want. Let’s say you want to make a 30 given puzzle, you’ll end up erasing 51 of the 81 spaces. So pick a random space, erase the cell there and then apply your Sudoku solver to the puzzle. If it comes up with a single solution then you can go ahead and erase another digit. If erasing a digit makes the puzzle have two or more solutions, mark that space as one not to erase and go erase a different one. Repeat until you have the number of empty spaces you want – or you’ve run out of spaces to erase. If you wish, you can always backtrack and fill in the previous space you erased and erase a different one.

The revealing method starts off with an empty grid where you reveal a starting number of spaces and apply your solver to the puzzle. If it finds multiple solutions, reveal another space and keep doing this until the puzzle has a single solution. This is a great way to make hard puzzles but is also fairly slow since solving puzzles with small numbers of givens is the hardest thing to do. You also need to ensure that once your solver finds more than one solution since when working up revealing spaces there will often be many, many possible solutions. A useful trick for determining which space to reveal is to compare the two solutions your solver came up with. Pick a random spot where the digits are different between the two puzzles. This is one place where you know revealing a digit will make the puzzle less ambiguous.