Monte-Carlo Tree Search and minimax hybrids

Monte-Carlo Tree Search (MCTS) is a sampling-based search algorithm that has been successfully applied to a variety of games. Monte-Carlo rollouts allow it to take distant consequences of moves into account, giving it a strategic advantage in many domains over traditional depth-limited minimax search with alpha-beta pruning. However, MCTS builds a highly selective tree and can therefore miss crucial moves and fall into traps in tactical situations. Full-width minimax search does not suffer from this weakness. This paper proposes MCTS-minimax hybrids that employ shallow minimax searches within the MCTS framework. The three proposed approaches use minimax in the selection/expansion phase, the rollout phase, and the backpropagation phase of MCTS, respectively. Because they do not require domain knowledge in the form of evaluation functions, these hybrid algorithms are a first step toward combining the strategic strength of MCTS with the tactical strength of minimax. We investigate their effectiveness in the test domains of Connect-4 and Breakthrough.
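To make the idea of minimax in the rollout phase more concrete, the following is a minimal sketch and not the paper's implementation. It assumes tic-tac-toe as a small stand-in domain (the paper's test domains are Connect-4 and Breakthrough), and all function names (`shallow_minimax`, `informed_rollout_move`, `rollout`) are hypothetical. The search uses no evaluation function: it only reports a proven win, a proven loss, or "nothing proven within the depth limit", and the rollout picks uniformly at random among the moves with the best such value.

```python
import random

# All eight winning lines on a 3x3 board indexed 0..8 (row-major).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def legal_moves(board):
    """Indices of empty squares."""
    return [i for i, v in enumerate(board) if v == 0]


def play(board, move, player):
    """Return a new board (tuple) with `player` placed on `move`."""
    return board[:move] + (player,) + board[move + 1:]


def shallow_minimax(board, player, depth):
    """Depth-limited negamax without an evaluation function.

    Returns +1 if `player` (to move) has a proven win, -1 for a proven
    loss, and 0 otherwise (draw, or nothing proven within `depth` plies).
    """
    w = winner(board)
    if w != 0:
        return 1 if w == player else -1
    moves = legal_moves(board)
    if not moves or depth == 0:
        return 0
    best = -1
    for m in moves:
        value = -shallow_minimax(play(board, m, player), -player, depth - 1)
        best = max(best, value)
        if best == 1:  # a proven win cannot be improved on
            break
    return best


def informed_rollout_move(board, player, depth, rng):
    """Choose a rollout move: random among moves with the best shallow value.

    Assumption of this sketch: ties are broken uniformly at random.
    """
    moves = legal_moves(board)
    values = {m: -shallow_minimax(play(board, m, player), -player, depth - 1)
              for m in moves}
    best = max(values.values())
    return rng.choice([m for m in moves if values[m] == best])


def rollout(board, player, depth, rng):
    """Play to the end of the game; return the winner (1, -1) or 0 for a draw."""
    while winner(board) == 0 and legal_moves(board):
        move = informed_rollout_move(board, player, depth, rng)
        board = play(board, move, player)
        player = -player
    return winner(board)


if __name__ == "__main__":
    rng = random.Random(0)
    # Player 1 can complete the top row; a 1-ply search finds the win.
    board = (1, 1, 0, -1, -1, 0, 0, 0, 0)
    print(informed_rollout_move(board, 1, 1, rng))
```

With `depth=1` the rollout only takes immediate wins; with `depth=2` it also avoids moves that let the opponent win on the next ply, which is the kind of shallow trap the abstract says plain MCTS can miss. Deeper searches make each rollout more expensive, so fewer rollouts fit in the same time budget.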
