Memory Bounded Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) is an effective decision-making algorithm that often works well without domain knowledge and is finding increasing application in commercial mobile and video games. A promising application of MCTS is creating AI opponents for board and card games, where Information Set MCTS (ISMCTS) can provide a challenging opponent and reduce the cost of creating game-specific AI opponents. Most research to date has aimed at improving the quality of decision making by (IS)MCTS with respect to time usage. Memory usage is also an important constraint in commercial applications, particularly on mobile platforms or when there are many AI agents. This paper presents the first systematic study of memory bounding techniques for (IS)MCTS. (IS)MCTS is well known to be an anytime algorithm; we also introduce an anyspace version of (IS)MCTS which can make effective use of any pre-specified amount of memory. This algorithm has been implemented in a commercial version of the card game Spades that has been downloaded more than 6 million times. We find that for games of imperfect information, high-quality decisions can be made with rather small memory footprints, making (IS)MCTS an even more attractive algorithm for commercial game implementations.
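To make the idea of a memory bound concrete, the following is a minimal sketch of one plausible way to cap the number of tree nodes in plain UCT-style MCTS: once the budget is reached, the least recently visited leaf is freed before a new node is added. The abstract does not describe the paper's actual mechanism, so this should not be read as the authors' algorithm. The class `BoundedMCTS`, the `max_nodes` parameter, the recycling rule, and the toy counting game are all assumptions introduced for illustration, and the example uses a perfect-information game rather than ISMCTS.

```python
import math
import random

TARGET = 10  # toy game (assumption): players alternately add 1 or 2; reaching TARGET wins

def legal_moves(state):
    total, _player = state
    return [m for m in (1, 2) if total + m <= TARGET]

def apply_move(state, move):
    total, player = state
    return (total + move, 1 - player)

def winner(state):
    total, player = state
    return 1 - player if total == TARGET else None  # the player who just moved

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.visits = 0
        self.wins = 0.0      # wins for the player who moved into this node
        self.last_used = 0   # iteration at which the node was last visited

class BoundedMCTS:
    """UCT search whose tree never holds more than max_nodes nodes (max_nodes >= 2)."""

    def __init__(self, root_state, max_nodes, seed=0):
        self.root = Node(root_state)
        self.max_nodes = max_nodes
        self.node_count = 1
        self.clock = 0
        self.rng = random.Random(seed)

    def _recycle(self, protected):
        """Free the least recently used leaf; return False if none is eligible.
        Nodes on the current path and direct children of the root are kept."""
        victim, stack = None, [self.root]
        while stack:
            n = stack.pop()
            if n.children:
                stack.extend(n.children)
            elif (n not in protected and n.parent is not self.root
                  and (victim is None or n.last_used < victim.last_used)):
                victim = n
        if victim is None:
            return False
        victim.parent.children.remove(victim)
        self.node_count -= 1
        return True

    def iterate(self):
        self.clock += 1
        node, path = self.root, [self.root]
        while True:  # selection: descend while the node is fully expanded
            tried = {c.move for c in node.children}
            untried = [m for m in legal_moves(node.state) if m not in tried]
            if untried or not node.children:
                break
            parent = node
            node = max(parent.children, key=lambda c: c.wins / c.visits
                       + math.sqrt(2 * math.log(parent.visits) / c.visits))
            path.append(node)
        # expansion: only if there is free budget or a leaf can be recycled
        if untried and (self.node_count < self.max_nodes
                        or self._recycle(set(path))):
            move = self.rng.choice(untried)
            node = Node(apply_move(node.state, move), parent=node, move=move)
            node.parent.children.append(node)
            self.node_count += 1
            path.append(node)
        state = node.state  # simulation: uniformly random playout
        while winner(state) is None:
            state = apply_move(state, self.rng.choice(legal_moves(state)))
        won_by = winner(state)
        for n in path:  # backpropagation
            n.visits += 1
            n.last_used = self.clock
            if won_by == 1 - n.state[1]:
                n.wins += 1

    def search(self, iterations):
        for _ in range(iterations):
            self.iterate()
        return max(self.root.children, key=lambda c: c.visits).move

tree = BoundedMCTS((8, 0), max_nodes=3)
print(tree.search(500), tree.node_count)
```

In this sketch the search can run for any number of iterations while `node_count` stays at or below `max_nodes`, which is the sense in which it is both anytime and space bounded. The linear scan in `_recycle` is chosen for brevity; a practical implementation would more likely keep the nodes in a least-recently-used queue so that recycling takes constant time.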
