A Rather Excellent Minesweeper AI: Part II

Posted by

Part II: Change the Game, with Frontier DP

Phew… now onto the diehard part.

It’s time to improve the algorithm so that it can solve very large boundaries.

Let’s pretend for example that our boundary is a straight line. If we backtrack from left to right, we notice that the validity of tiles we add to our stack are not affected by tiles 3 squares before it. Hmm… is this observation useful? Yes. Very much so. 

Instead of backtracking, let’s consider a new strategy for generating all valid configurations. We will create all configurations by going through the tiles left to right, keeping a set of valid arrangements so far, and growing that set by adding tiles one by one and considering the possibility of the new tile being a mine or not. This is essentially the “breadth first” version of the backtracking algorithm. Just build all solutions in parallel instead of recursion on each of them.

 

In the image, we’ve reached the 6th tile and there happen to be 4 solutions. Now it’s time to add the 7th. Notice that the leftmost number we neighbour is the “2” on the 6th column. And so, any mines on or before the 4th column in an arrangement don’t affect the validity of the 7th column.

Because we only care about the mine arrangements on columns 5 & 6, we can actually merge solutions with the same arrangements on those columns. For each merged arrangement, we keep track of how many actual solutions it represents. For each columns before the 5th, we only keep track of how many solutions of which it is a mine, since that’s what we’re ultimately interested in. After all, the explicit arrangements were only there to help us calculate that, and so we can discard parts of the explicit arrangement when we don’t need them anymore. In this example, we’ve merged solutions 3 & 4.

 

Arrangement 3 now contains 2 pseudo-solutions, while arrangement 1 & 2 are singletons. We can now expand the current 3 unique arrangements with column 7 (Note that arrangement 3 moves up to arrangement 5 in the following diagram): 

Now notice that we don’t need arrangement information extending to column #5 anymore, since a mine in column 5 won’t affect the validity of any remaining columns. We then collapse column #5, which in effect merges arrangement 4 & 5:

By repeatedly expanding the existing arrangements when adding new tiles, and collapsing them when they become irrelevant, we keep the number of arrangements we need to consider very small. It is still exponential, but it is exponential in only the small number of relevant tiles at one time, rather than exponential in the number of total tiles. 

With a bit of careful implementation, total #mines in the group can also be incorporated in the calculations, so we can calculate final probabilities correctly. We can keep track of #solutions vs total mine count within an arrangement from the very start, and propagate them accordingly as we add and collapse tiles. This adds an O(#mines) factor to the calculation.

It seems like, we just turned an exponential backtrack into a linear algorithm with a small constant factor of O(2^k), where k is the number of un-collapsed considered tiles, 

With that, we have the basis of the Frontier DP Algorithm.

The Frontier DP Algorithm

In our previous example, we worked with a straight line. But the real minesweeper field has rugged perimeters, or even loops, or other arbitrary planar graph arrangements. But we can generalize the algorithm to that.

Let’s call the unknown tiles we want to evaluate as the candidate tiles (purple). We want to visit every candidate tile using this method. We keep a dynamic working set of tiles (green) for which we will keep all valid arrangements of mines for explicitly. We remove tiles from the working set if and only if there are no more new tiles neighbouring the numbers that neighbour it. We also want to keep the working set small at all times, since the number of explicit combinations is exponential to the size of the set.

To add an unvisited tile, we look at all existing explicit arrangements so far and create the “mine” and “clear” possibilities to form potentially 2 new arrangements.

By choosing the order of visits wisely, we can keep the dynamic set of working tiles small. For example, in a contiguous line, it’s good to visit each square in the order of the line instead of randomly. The set will then be no larger than 5 or 6. Luckily, in minesweeper, the boundary unknown tiles are usually the perimeter of large blocks and naturally lends itself to thin line formations. But even for simple cycles or trees, it’s not trivial to find the best traversal order. Needless to say, many games end up with more complex arrangements.

Choose the order of tiles to add is the next problem.

The Pathwidth Problem

The pursuit of the best visiting order that minimises the size of the working set is actually synonymous to the pathwidth problem. It has been studied extensively. It is … drumroll … NP-Complete. I wish I found this out before spending days trying to come up with a cute fast solution.

Frustratingly, even the low-degree nature of minesweeper graphs and the fact that they are planar won’t lend itself to faster solutions. There are linear solutions for very small pathwidths, but even a field needing only 6 tiles in a set at once creates unfeasably high constants.

My efforts went towards a heuristic that reasonably and quickly developed a path through the unknown tiles which seems to work well enough in all games.

How I approximated the pathwidth problem

 
I used a greedy method (disguised as a broken A*) to do this part. Each number on the boundary was considered a “rule” in a constraint satisfaction problem (CSP), with their neighbouring unknown tiles as the variables, all summing to the number itself. Each node was a set of visited variables, and the destination was the state with all variables visited, and thus solved. We add variables one rule at a time as we explore to help “clump” them together, and we remove variables individually as they get irrelevant. We give heuristic weight to current states with fewer variables, among other factors. The heuristic routinely under-estimates, which is in violation of A* but this is necessary to avoid the exponential time of finding the actual optimal solution to an NP-Complete problem. The result is a sub-optimal but practical greedy method that is initially guided gently by A*.

Next in Part III

Go to part 3 for the fun part: watch the AI in action!

Series Navigation<< A Rather Excellent Minesweeper AIA Rather Excellent Minesweeper AI – Results >>

Leave a Reply

Your email address will not be published. Required fields are marked *