Monte Carlo Tree Search in Lines of Action

The success of Monte Carlo tree search (MCTS) in many games where αβ-based search has failed naturally raises the question of whether Monte Carlo simulations will eventually also outperform traditional game-tree search in domains where αβ-based search currently excels. The forte of αβ-based search is highly tactical, deterministic game domains with a small to moderate branching factor, where efficient yet knowledge-rich evaluation functions can be applied effectively. In this paper, we describe an MCTS-based program for playing the game Lines of Action (LOA), a highly tactical, slow-progression game exhibiting many of the properties that are difficult for MCTS. The program uses an improved MCTS variant that allows it both to prove the game-theoretic value of nodes in a search tree and to focus its simulations better using domain knowledge. The resulting simulations are superior both in handling tactics and in ensuring game progression. Using the improved MCTS variant, our program outperforms even the world's strongest αβ-based LOA program. This is an important milestone for MCTS, because the traditional game-tree search approach has so far been considered better suited for playing LOA.
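To make the abstract concrete for readers unfamiliar with MCTS, the selection step it refers to is typically implemented with the UCB1 rule (the UCT algorithm of Kocsis and Szepesvári): at each tree node, descend to the child maximizing an exploitation term plus an exploration bonus. The sketch below is a generic illustration of that rule, not the paper's enhanced variant; the `exploration` constant and the `(visits, total_reward)` child representation are assumptions for the example.

```python
import math

def uct_select(children, exploration=1.4):
    """Return the index of the child to descend into under UCB1.

    children: list of (visits, total_reward) pairs, one per child.
    The parent's visit count is taken as the sum of child visits,
    a common simplification in minimal UCT sketches.
    """
    parent_visits = sum(v for v, _ in children)

    def ucb1(child):
        visits, total_reward = child
        if visits == 0:
            return float("inf")  # unvisited children are expanded first
        # Exploitation (mean reward) plus exploration bonus.
        return (total_reward / visits
                + exploration * math.sqrt(math.log(parent_visits) / visits))

    return max(range(len(children)), key=lambda i: ucb1(children[i]))
```

A program such as the one described here layers domain knowledge on top of this rule (e.g. biasing the simulations) and, following the MCTS-Solver idea of reference [8], backs up proven wins and losses as exact game-theoretic values rather than averaged rewards.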
