Scalability and Parallelization of Monte-Carlo Tree Search

Monte-Carlo Tree Search is now a well-established algorithm, in games and beyond. We analyze its scalability, in particular its limitations and their implications for parallelization. We focus on our Go program MoGo and our Havannah program Shakti, using both multicore machines and message-passing machines. For both games and on both types of machines, we achieve adequate efficiency for the parallel version. However, in spite of promising results in self-play, there are situations in which increasing the time per move does not solve anything. Since parallelization at best buys the equivalent of more time per move, it is not a solution to all our problems. Nonetheless, for problems where the Monte-Carlo part is less biased than in the game of Go, parallelization should be quite efficient, even without shared memory.
