Transpositions and move groups in Monte Carlo tree search

Monte Carlo search, and specifically the UCT (Upper Confidence Bounds applied to Trees) algorithm, has contributed to a significant improvement in computer Go and has received considerable attention in other applications. This article investigates two enhancements to the UCT algorithm. First, we consider possible adjustments to UCT when the search tree is treated as a graph, so that information is shared among transpositions. The second modification introduces move groups, which may reduce the effective branching factor. Experiments with both enhancements were performed on artificial trees and in the game of Go. From the experimental results we conclude that both exploiting the graph structure and grouping moves may contribute to an increase in the playing strength of game programs using UCT.
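To make the idea of sharing information among transpositions concrete, the following is a minimal sketch and not the article's implementation. It assumes the standard UCB1 selection rule and keeps one statistics record per position in a table, so that two move sequences reaching the same position update the same record. The names `Stats`, `table`, `stats_for`, `select` and `update` are hypothetical, the position keys stand in for whatever hash a real program would use, and rewards are taken from a single player's perspective for brevity. The article considers several possible adjustments to UCT for the graph case; this shows only the simplest form of sharing.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    """Visit count and accumulated reward for one position."""
    visits: int = 0
    total_reward: float = 0.0

    @property
    def mean(self) -> float:
        return self.total_reward / self.visits if self.visits else 0.0


def ucb1(child: Stats, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score; unvisited children score +inf so they are tried first."""
    if child.visits == 0:
        return math.inf
    return child.mean + c * math.sqrt(math.log(parent_visits) / child.visits)


# Hypothetical transposition table: one Stats record per position key.
# Any hashable key works here; a real program would likely use a position hash.
table: dict = {}


def stats_for(key) -> Stats:
    """Return the shared record for a position, creating it on first use."""
    return table.setdefault(key, Stats())


def select(parent_key, child_keys):
    """Pick the child position with the highest UCB1 score."""
    parent_visits = max(stats_for(parent_key).visits, 1)
    return max(child_keys, key=lambda k: ucb1(stats_for(k), parent_visits))


def update(path_keys, reward: float) -> None:
    """Back up one simulation result along the positions that were visited."""
    for key in path_keys:
        record = stats_for(key)
        record.visits += 1
        record.total_reward += reward
```

For example, after `update(["root", "a", "X"], 1.0)` and `update(["root", "b", "X"], 0.0)`, the position `"X"` has been reached by two different move orders, and its single shared record holds both results. A plain tree would instead keep two separate nodes with one visit each.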
