# The Probabilistic Sudoku Solver (PSS)

This is Michael A. Gottlieb's unorthodox probabilistic sudoku solver page.

Description of The  Probabilistic Sudoku Solver (PSS) Algorithm

PSS is unusual amongst sudoku solvers. It does not work at all like a human being; indeed, it does not use any form of logical deduction or backtracking. You could say that it works by making well-educated guesses. Here is a functional description of the algorithm:

The puzzle state is represented by 729 probabilities, one for each possible digit (1 thru 9) in each possible cell (of the 9x9 grid). Thus, if a character k definitely occupies a cell C then the probability of k in C is 1.0 and the probability of the other 8 characters in C is 0.0; furthermore, the probability of k in all other cells belonging to any of the three groups (the row, column, or region) to which C belongs is 0.0. The game state is initialized by setting probabilities to 1.0 or 0.0 in this manner as dictated by the initially occupied ("clue") cells.  The remaining unassigned probabilities are set so that the total probability of each digit in all cells sums to 9, and the probabilities of the digits in each cell sums to 1.

After the probabilities are initialized, PSS enters a two step loop which repeats until the puzzle is solved (or fails to be solved). I call the two processes in the loop "evolution" and "clipping."

The evolution process adjusts the probabilities incrementally in order to conserve probability within the game state. For probability to be conserved, the sum of the digit probabilities for each cell must equal 1.0, and, for each character k the sum of the probabilities of k within each group must equal 1.0. After initialization (or after clipping), the probabilities will not satisfy these conservation laws (assuming the problem is not yet solved). The evolution process perturbs the probabilities, gently nudging them to converge to a (nearby) state in which the conservation laws are satisfied.

The clipping process is very simple: The highest probability (of all 729) is incremented by .5  (and clipped at 1.0).

When any probability becomes either 0.0 or 1.0 it is locked at that value so PSS can no longer change it. This means that there is no "backtracking." Once PSS decides that a digit goes  (or can not go) in a cell, it never changes it's mind -  there is no trial and error involved in this algorithm.

Performance on Randomly Generated Puzzles

When all the probabilities have been locked (to 0.0 or 1.0) the puzzle is solved. However, PSS is not always able to accomplish this goal.  For example, it will fail on inconsistent puzzles that have no solution. If PSS goes through more than about 100 loop iterations without solving any new cells, it gives up. In general I find that it solves almost all puzzles intended for human entertainment, from very easy through very hard. Considering the simplicity of the algorithm it is rather surprising that PSS does so well. However, PSS is not perfect - it does fail on a significant number of so-called "unsolvable," "fiendish," or "bifurcating" puzzles.

I tested PSS against 3500 randomly generated puzzles rated for difficulty (with much thanks to Michael Kennet for his excellent public domain sudoku program); The first test run consisted of 500 puzzles, the second of 1000, and the third of 2000. Here are the results:

`Worked 500 total puzzles. Solved 488. percent solved = 97.60`
```Worked 9 very easy puzzles. Solved 9. percent solved = 100.00
Worked 90 easy puzzles.      Solved 90. percent solved = 100.00
Worked 227 medium puzzles.    Solved 227. percent solved = 100.00
Worked 148 hard puzzles.      Solved 146. percent solved = 98.65
Worked 26 fiendish puzzles.  Solved 16. percent solved = 61.54```
`Worked 1000 total puzzles. Solved 979. percent solved = 97.90`
```Worked 17 very easy puzzles. Solved 17. percent solved = 100.00
Worked 177 easy puzzles.      Solved 177. percent solved = 100.00
Worked 484 medium puzzles.    Solved 484. percent solved = 100.00
Worked 261 hard puzzles.      Solved 257. percent solved = 98.47
Worked 61 fiendish puzzles.  Solved 44. percent solved = 72.13```
```Worked 2000 total puzzles. Solved 1964. percent solved = 98.20

Worked 17 very easy puzzles. Solved 17. percent solved = 100.00
Worked 389 easy puzzles. Solved 389. percent solved = 100.00
Worked 955 medium puzzles. Solved 955. percent solved = 100.00
Worked 543 hard puzzles. Solved 537. percent solved = 98.90
Worked 96 fiendish puzzles. Solved 66. percent solved = 68.75```

These tests show that PSS solves about 97% - 98% of randomly generated puzzles; it solves all of the very easy, easy, and medium puzzles, all but about 1.3% of the hard puzzles, and 60-70% of the fiendish puzzles.

Performance on Puzzles Considered Extremely Difficult

Of 1106 "unsolvable" puzzles generously donated to this effort by Andrew Stuart of sudoku.ork.uk, PSS solved 573 - about 52%. According to Andrew these puzzles represent 5% of their randomly generated puzzles, which is consistent with the above statistic: that PSS solves about 97% - 98% of randomly generated puzzles.

Many lists of difficult puzzles have been posted at http://magictour.free.fr/sudoku.htm by Guenter Stertenbrink. Amongst them I tested the following:

"topn87" ("87 hardest sudokus sorted by rating , they are hard for the new as well as for the old program. Many new sudokus included from Gordon Royle's list of 18-clues-sudokus")

PSS solves 61 - about 70%

"top95" ("95 hardest sudokus sorted by rating: they are here mainly for historical reasons")

PSS solves 37 - about 39%.

"topn234" ("234 hardest sudokus sorted by rating")

PSS solves 155 - about 66%.

"top1465" ("1465 hardest sudokus sorted by rating")

PSS solves 479 - about 33%.

Performance on Minimal Puzzles

I tested PSS against the first 3690 puzzles in the mammoth list of 36,628 17-clue puzzles posted by Gordon Royle at  http://www.csse.uwa.edu.au/~gordon/sudokumin.php and found that PSS solves 3375 - about 91%.

Performance on Ambiguous Puzzles

I have also tested PSS against many "ambiguous" puzzles, which have multiple solutions. PSS is generally able to solve ambiguous puzzles -- it does well on puzzles with very few clues (from 0 to 16). However I have not tested PSS against enough ambiguous puzzles to draw any conclusions regarding them.

How it works  (details of the algorithm)

The algorithm works in a way very similar to simulated annealing: the evolution process is analogous to annealing - the clipping process introduces noise. More details may follow later. For now, the curious may refer to the source code included in the Visual C project below.

Open Question

How can we characterize the 2%-3% of puzzles that PSS is unable to solve? Why does PSS fail to solve them? Since PSS solves all but the most difficult puzzles, the answer to this question may shed some light on why certain puzzles are so very difficult for people to solve.

PSS PROGRAM FOR WINDOWS/DOS

PSS DOS executable (.exe 52K)

PSS.EXE runs either in DOS or under a DOS command line interpreter in Windows.

PSS reads puzzles from standard input. (The easiest way to use PSS is to put all your puzzles in a file and redirect the file to standard input.) If there is more than one puzzle in the file, PSS  works on each one in the order they appear in the file.

PSS takes one command line argument, a number N which, can be positive or negative. if N is positive it must be between 1 and 81, and it tells PSS how many cells to solve. If N is negative, PSS attempts to solve the entire puzzle and then reports whether or not it succeeded, but without showing you the solution. These features are designed to allow PSS to be used as a "clue generator" to assist human puzzle solvers. If the command line argument is absent,  PSS attempts to solve the entire puzzle, and then reports the solution (or partial solution if it failed).

Here are some sample PSS commands:

To solve all the puzzles in file puzzles.txt:

C:> PSS < puzzles.txt

To solve 10 (additional) cells in each of the puzzles in file puzzles.txt:

C:> PSS 10 < puzzles.txt

To find out whether PSS is able to solve the puzzles in file puzzles.txt  (without seeing any solutions):

C:> PSS -1 < puzzles.txt

PSS Puzzle Format

The puzzle input format for PSS is as follows: All whitespace characters are ignored, digits 1-9 are interpreted as filled cells (clues) and any other character is interpreted as an (initially) empty cell; cells are filled in left to right top to bottom order. A total of 81 non-whitespace characters is required to specify each puzzle

Here is an example of PSS puzzle input that uses underscore "_" for empty cells (white space is included only for human readability):

56_ 1__ ___

__3 5__ 8__

__9 4__ ___

8__ ___ _45

6_5 7_4 3_9

93_ ___ __1

___ __1 4__

__2 __5 7__

___ __8 _96

Project and Source Code

PSS Visual C project (.zip 37K)

This project includes everything you need to build the PSS DOS executable, PSS.EXE. Sample puzzle data is also included.

Source code for the Mathlink version is included too, but to build the Mathlink version you will need the Mathlink include file and library that come with Mathematica. A Mathematica notebook is included which demonstrates how to use PSS under Mathematica (including code that generates animations showing the evolution of the 729 probabilities).

All source code in this project is protected by copyright.

PSS for Mathematica

PSS Mathematica executable (.exe 48K)

This Mathlink-compatible executable allows PSS to run under Mathematica. The Mathlink version of PSS is very handy for observing and analyzing the behavior of the algorithm, generating animations, etc.

Contact Info

Please send comments and questions regarding PSS to:  Michael A. Gottlieb (click for mail form)