research gate google scholar linkedin email

Can Monte-Carlo Tree Search learn to sacrifice?

Monte-Carlo tree search is a powerful game playing algorithm which uses stochastic methods to evaluate game states. It has performed extremely well in some games, but has proven less useful in others, notably chess. This deficiency prompted us to investigate the efficacy of MCTS in finding sacrifice moves, moves which require some resource to be deliberately given up in order to open up a good sequence of moves.

Sacrifice in Ultimate-Tic-Tac-Toe

Sacrifice moves are common in many games. It refers to a bad move, which gives up some resource, in order to create a game state in which the loss is recovered with a particular sequence of moves that leads to an advantage over the opponent. For example, in chess, one might sacrifice a piece to gain a positional advantage. In Go, one might put a stone in a position where it cannot be saved, but its presence forces your opponent to play safely so as to prevent the stone becoming useful.

Move 1 Move 2Move 3

An example of a sacrifice sequence can be seen in the figure above. At the beginning of the sequence, it is the turn of X to move in the lower left box. The move being made is shown in bold. In order to understand the sequence of moves, check which boxes have already been won at the start of the scenario, and then track which boxes are won over the course of the scenario.

In order to get a guaranteed win, X must give up the lower right box. If, after giving up the lower right box, X did not play the winning move, then O would be closer to winning the entire game, since O would have already won the lower right box. Thus, immediately after O wins the lower right box, X's position is worse than if she had not sent O to the lower right. However, if X chooses the correct move and wins the game, then the sacrifice move is revealed as optimal.

download Download source code

To run the code, change the range of "i" in line 46 to determine which enhancements to use (explained in the code comments)

To change the scenarios, change the range of "setting" in line 86.

by Nathan Companez