Homework 11: Sudoku Solver

Due: Friday 16 April 2010 at 9:59:59 am

Under Construction! The game architecture is being completely redone from last year's code. Please read the following only as general guidance. The instructions and descriptions will be changed very soon...

The Sudoku Game

Sudoku is a 9*9 grid-based puzzle in which the goal is to place numbers from 1 to 9 in the grid squares taking into account specific constraints. The 9 * 9 puzzle grid can be seen as divided into 9 sub-grids of size 3*3. The 3 main conditions in the classic version of Sudoku state are that each square in the grid contains a number from 1 to 9 and a number cannot be repeated:

In any line
In any column
In any on the 3*3 sub-puzzles

For more information of the game of Sudoku, visit: http://en.wikipedia.org/wiki/Sudoku

Need to practice your Sudoku-playing skills? There are lots of on-line games. Here's one: http://www.websudoku.com/

A Sudoku puzzle as needed for this assignment may not be well formed and in general may have no solutions or multiple solutions. Your solution must be able to handle these situations!

The assignment

The purpose of this assignment is to build a Java solver that finds all of the different solutions of the game using a constraint-based approach.

In a constraint-based approach, each square is viewed as a variable that can take on multiple values from a given set. A partial (intermediate) solution is one in which at least one variable has more than one possible value. A final solution is one which each variable has exactly one value (all the sets are singletons).

The required solution (built on the basis of the support code provided) should work as follows. At each step, the program will evaluate a partial solution. Initially, this partial solution corresponds to the input Sudoku puzzle (a 9*9 matrix that is partially filled) in which the values of some squares are known and others are unknown.

Modeling Sudoku

To find all solutions for a partial solution, we have to keep track of all the possible values for each square in the grid. Each square is modeled by an interface, ICell.

In general, there are many ways to model a system. Naively, we could simply model the Sudoku board as a NxN square matrix of cells. We could certainly build our system this way and there might even be some advantages to doing so, e.g all out speed if one does a lot of optimization since the underlying processor architecture is optimized for arrays.

But one of the main tenets of modeling is that one's model of a system should should capture and clearly express the salient characteristics of the system. Sudoku is more than a dumb square grid of numbers. Here are some Sudoku-specific features:

An empty board is fully determined by a single value, it's "order", which for a 9x9 board is 3.
The board is viewable in terms of rows, columns and blocks.
- Each row, column and block has exactly order*order cells, e.g. for order = 3, each row, column and block has 9 cells.
- Each board has exactly order*order rows, columns and blocks.
Each cell is a member of exactly one row, one column and one block.
- No rows share the same cell. Likewise, no columns or blocks share cells. This forces the square matrix layout of rows, columns, and orderxorder sub-matrix of blocks if one were to display the rows, columns and blocks where each cell is only shown once.
The rules of Sudoku stipluate that the value in any given cell has very specific constraints:
- The valid values are 1...order*order (inclusive), e.g. for order=2, the allowed values are [1, 2, 3, 4]
- Any valid value in a cell cannot appear in another cell in the row, column or block to which that cell belongs. Conversely, each row, column or block must contain every valid value exactly once.

Notice how rows, columns and blocks are "homomorphic", that is they are exactly the same abstractly and structurally in the above feature descriptions. Nowhere does it say that rows are horizontal, or columns are vertical or that the blocks are arranged in a square sub-matrix -- that's strictly a result of trying to display each cells only once in our visual representation of the board.

What does this all mean? ==> Rows, columns and blocks can all be represented with the same data structure! We will use the interface ICellSet to represent the set of cells corresponding to a particular row, column or block. There is also the implication that each cell "knows" which specific row, column and block to which it belongs, so each ICell has a set (vector) of ICellSets that correspond to the row, column and block to which that cell belongs (in that order).

A "board" is therefore a set of three {{Vector<ICellSet>}}s representing the rows, columns and blocks of the board. Note that any given cell will appear once in the rows, once in the columns and once in the blocks. The data structures that we have created enable us to access the cells in multiple methods that clearly represent the cell relationships in the manner that best models the problem at hand, namely the Sudoku board. We can easily and consistently see the cells in terms of rows or columns or blocks in a manner that is abstractly equivalent, enabling us to easily reuse the same code to process the cells in any direction we choose. Compare this to a standard NxN matrix representation (a collection of rows) which represents only a single cell relationship feature, namely that they are in rows. Luckily, accessing the cells in terms of columns is not too difficult, but accessing them in terms of blocks is a bear!

Solving Sudoku

The fundamental principle of Sudoku is that any given row, column or block must contain all integers from 1 to order*order inclusive, exactly once.

In the beginning, a Sudoku board is defined by particular cells already being solved (i.e. have a single value in them) and all the rest being unsolved (i.e. have all possible values in them). We can think of an unsolved cell as containing all possible guesses for the value at that moment, which at the beginning, is all possible values.

Thus the simplest reduction process is to look at the set of all values from the solved cells in the row, column and block to which a cell belongs and eliminating those values from the current set of value choices for that cell. (Hint: HashSet<T>.removeAll() does exactly what you want!) To reduce a cell set is to reduce every cell in that cell set. To reduce a board is to reduce every cell in any collection of cell sets -- do you need to reduce every collection of cell sets?

Requirement: the reduction process for a cell set and for a board should be able to detect if the cell set or board is

Irreducible - the reduction pass did not change any values.
Solved - all the cells were solved already or became solved
Unsolvable - a cell somewhere has or ends up having no value choices left, i.e. and empty HashSet of values.
Reducible - a change to the number of values for a cell somewhere occurred.

Irreducible Sudoku boards ARE potentially solvable

You should first get a simple reduction pass working. But once that is operational, you can tackle how to deal with irreducible boards. An irreducible board means that, at least To solve an irreducible board,

To find final solutions from a partial solution, one should look for a square with multiple possible values, generate the different partial solutions by specifying that the square contains one of the values (and impose the restrictions specified by the rules of Sudoku on the content of the other squares), and then recurse for each of these possible solutions until no square has more than one possible value. If square becomes empty, then it will not lead to a final solution. If all squares have exactly one value, then we have a final solution.

The support code provided contains a DrJava project and a few simple Junit tests. The code download is coming "Real Soon Now" ...

Supporting Code Details

Programming Tips

Mutating a value outside of an anonymous inner class

Controlling a loop from inside of an anonymous inner class

How to make copies of a board when you don't know which cell is the one you want to guess from

Your assignment is to implement:

Testing!

Extra credit

Incorporate the supplied GeneratePuzzle utility capability into the main Sudoku solve, both in the view and the model, to enable the user to generate and solve any of the 50,000 supplied puzzles.

As any seasoned Sudoku player knows, there are other reduction rubrics that can be applied to more quickly reduce a board. For instance, if an unsolved cell includes a value choice that no other unsolved cell in its cell contains, then the cell must have that value (this can be extended to a sub-set of values spread across multiple cells in a cell set). Can you incorporate additional reduction rubrics in a modular, flexible and extensible fashion?

Submission

Submit via Owlspace a .zip file containing all the files from the support code including those that you modified. Don't forget to add as header to the GameModel class, your names and ids.

Child pages

211hw11_S11