Mathematics of Sudoku Leads To "Richter Scale" of Puzzle Hardness

The most difficult known Sudoku puzzle has ‘Richter scale’ rating of 3.6, according to a new mathematical description of puzzle hardness. But there could be harder puzzles out there

Emerging Technology from the arXivarchive page

August 6, 2012

The global fascination with Sudoku has led to a sudden interest in the mathematical properties of the puzzle. In the last few months on this blog, we’ve looked at how mathematicians have solved the minimum Sudoku problem and even how they’ve used the mathematics of Sudoku to encrypt images.

Today, we get a different take on Sudoku thanks to the work of Maria Ercsey-Ravasz at Babes-Bolyai University in Romania and Zoltan Toroczkai at the University of Notre Dame in Indiana.

These guys have developed a way to measure the difficulty of a particular Sudoku puzzle and say their “Richter scale” of puzzle difficulty could be applied to a wide range of other games.

First, a little background about Sudoku. This is a number puzzle consisting of a 9 x 9 grid in which some cells contain clues in the form of digits from 1 to 9. The solver’s jobs is to ﬁll in the remaining cells so that each row, column and 3×3 box in the grid contains all nine digits. In addition, each grid can have only one solution.

Sudoku puzzles are generally classified as easy, medium or hard with puzzles having more starting clues generally but not always easier to solve. But quantifying the difficulty mathematically is hard.

Now Ercsey-Ravasz and Toroczkai say they’ve worked out a way to do it using algorithmic complexity theory. They point out that it’s easy to design an algorithm that solves Sudoku by testing every combination of digits to find the one that works. That kind of brute force solution guarantees you an answer but not very quickly.

Instead, algorithm designers look for cleverer ways of finding solutions that exploit the structure and constraints of the problem. These algorithms and their behaviour are are more complex but they get an answer more quickly.

The central point of Ercsey-Ravasz and Toroczkai argument is that because an algorithm reflects the structure of the problem, its behaviour–the twists and turns that it follows through state space–is a good measure of the difficulty of the problem.

To demonstrate this, they tackle the example of Sudoku. Instead of the brute force solution, they design a much more elegant algorithm that exploits the various constraints of the puzzle, such as the fact that each row column and subgrid must contain all the digits from 1 to 9.

In this way, they transform the problem into a type known to complexity theorists as a k-sat problem.

They start by inserting a random set of numbers into the grid and follow the algorithm’s trajectory through state space as it searches for a solution. For a simple problem, that trajectory is simple, as shown in the upper of the two figures at the top of this post.

But all that changes for a difficult problem. Ercsey-Ravasz and Toroczkai test their algorithm against a Sudoku grid so hard that it has its own name: the platinum blond. The result is shown in the bottom half of the figure. It is considerably more complex and takes ten times as long to solve.

Ercsey-Ravasz and Toroczkai say that for difficult problems, the trajectory becomes chaotic before settling on a solution. In fact, the time it takes to escape this chaotic state is a simple measure of difficulty.

On that basis, they create a ‘Richter scale’ of puzzle difficulty based on the escape rate. The scale goes from 1 to 4, with one being the easiest and 4 being ultrahard.

They say this scale correlates surprisingly well with the subjective human ratings with 1 corresponding to easy puzzles, 2 to medium puzzles and 3 to hard puzzles. The platinum blond has a difficulty of 3.5789.

An interesting corollary is that no Sudoku puzzle is known with a difficulty of 4. And the number of clues is not always a good measure of difficulty either. Ercsey-Ravasz and Toroczkai say they tested many puzzles including several with the 17 clues, the minimum number, and a few with 18 clues.

These were all easier to solve than the platinum blond, which has 21 clues. That’s because the hardness of the puzzle depends not only on the number of clues but also on their position as well.

An interesting question now is whether an ultrahard puzzle with a difficulty of 4 actually exists and how it can be found.

More significant than this is that Ercsey-Ravasz and Toroczkai’s method generalises to all k-sat problems of the same class as Sudoku. So the difficulty of these problems can all be classified by similar Richter-type scales.

That leaves just one question–what should the scale of puzzle difficulty be called? The obvious answer is the Ercsey-Ravasz and Toroczkai scale or ERT scale. Any other suggestions in the comments section below.

Ref: arxiv.org/abs/1208.0370: The Chaos Within Sudoku

Keep Reading

Large language models can do jaw-dropping things. But nobody knows exactly why.

And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.