Part III: Final Results
We now have a blisteringly fast and robust minesweeper solver that *correctly* determines the probability of each tile being a mine, even on enormous grids, in reasonable time.
Granted, there are artificial cases where the algorithm explodes into exponential runtime, but in nearly all practical gameplay scenarios the boundary forms nice thin lines and the algorithm is able to slide along them.
Here it is playing smoothly on a comically large 300×150 grid with 10,000 mines:
Note: even with perfect play, a grid this size yields close to zero wins due to the sheer number of forced coin flips that appear.
And here it is compared to the usual backtracking algorithm on 30×30 grids. On larger grids, the backtracking algorithm tends to get stuck for a very long time, but at 30×30 it usually makes progress:
Here it is whizzing through 2000 expert grids in 80 seconds:
The win rate for the 2007 expert runs above is 52.96%.
In this article, only algorithmic speed and correctness were considered; there is still work to do on the actual win rate. We need to improve the choice of the next tile to click in moments when no tile is entirely safe.
Currently, the AI picks the tile with the lowest probability of being a mine, without considering how much information revealing that tile would produce. It also ignores the information other tiles can gain later, once the number of remaining mines is better known. For example, in the scenario below with 11 total mines (so 3 not yet flagged), all tiles marked “A” have probability 0.333 of being a mine, while the two tiles marked “B” are a 50-50. However, the B group is completely independent of the rest and should be hit first, in order to gain information about the remaining mines. Since that 50-50 must be faced at some point anyway, we might as well take it early.
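To make the A/B intuition concrete, here is a minimal sketch of how per-tile probabilities fall out of enumerating independent frontier groups. The group sizes and mine counts (a 3-tile group constrained to 1 mine, a 2-tile group constrained to 1 mine) are invented for illustration and simplified from the position above; the tile names are hypothetical:

```python
from itertools import combinations

def group_configs(tiles, mines):
    """Enumerate every placement of `mines` mines among `tiles`."""
    return [frozenset(c) for c in combinations(tiles, mines)]

# Group A: 3 tiles, exactly 1 mine among them -> each tile is 1/3.
# Group B: 2 tiles, exactly 1 mine among them -> each tile is 1/2.
A = group_configs(["a1", "a2", "a3"], 1)   # 3 configurations
B = group_configs(["b1", "b2"], 1)         # 2 configurations

# The groups share no constraints, so joint configurations are all pairs.
joint = [a | b for a in A for b in B]

# Marginal probability of a tile = fraction of joint configs containing it.
for tile in ["a1", "b1"]:
    p = sum(tile in cfg for cfg in joint) / len(joint)
    print(tile, p)  # a1 -> 0.333..., b1 -> 0.5
```

The lowest-probability rule would always click an A tile here, even though resolving the independent B pair first can only sharpen the odds on A.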
Some ways to solve this might include a full game-tree analysis over all remaining mine possibilities once the number of unknown tiles is small. Another is to measure the entropy reduction of every action: if we click this square and don’t die, how many configurations remain for each possible revealed number? The lower the sum of squares of those counts, the better. Finally, it might be possible to train a neural network to recognize common frontier shapes, pick out the “juiciest” squares to hit next, and incorporate that into the probability weights. Training data could come from simulating each possible click in many scenarios and recording whether the game is eventually won after playing on normally.
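The sum-of-squares heuristic can be sketched in a few lines. The toy board here (a 1×5 strip with exactly one mine, so five equally likely configurations) and the helper names are invented for illustration, not the solver’s real API:

```python
from collections import Counter

# Toy position: a 1x5 strip of unknown cells with exactly one mine among
# them, giving five equally likely configurations.
cells = range(5)
configs = [frozenset({m}) for m in cells]

def revealed_number(cell, config):
    # Number shown if `cell` is clicked and survives: adjacent mine count.
    return sum(1 for n in (cell - 1, cell + 1) if n in config)

def guess_score(cell, configs):
    """Sum-of-squares heuristic: among configurations where `cell` is safe,
    bucket them by the number the click would reveal. A lower sum of squared
    bucket sizes means the click splits the hypothesis space more evenly,
    i.e. yields more information on average."""
    survivors = [c for c in configs if cell not in c]
    buckets = Counter(revealed_number(cell, c) for c in survivors)
    return sum(n * n for n in buckets.values())

for cell in cells:
    print(cell, guess_score(cell, configs))  # edge cells score 10, interior 8
```

Every cell here has the same 1/5 mine probability, so the current rule is indifferent between them, yet the heuristic correctly prefers interior clicks, whose revealed numbers split the remaining configurations more evenly.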