Paper information - Monte-Carlo Tree Search and minimax hybrids

Monte-Carlo Tree Search and minimax hybrids

Monte-Carlo Tree Search is a sampling-based search algorithm that has been successfully applied to a variety of games. Monte-Carlo rollouts allow it to take distant consequences of moves into account, giving it a strategic advantage in many domains over traditional depth-limited minimax search with alpha-beta pruning. However, MCTS builds a highly selective tree and can therefore miss crucial moves and fall into traps in tactical situations. Full-width minimax search does not suffer from this weakness. This paper proposes MCTS-minimax hybrids that employ shallow minimax searches within the MCTS framework. The three proposed approaches use minimax in the selection/expansion phase, the rollout phase, and the backpropagation phase of MCTS. Without requiring domain knowledge in the form of evaluation functions, these hybrid algorithms are a first step at combining the strategic strength of MCTS and the tactical strength of minimax. We investigate their effectiveness in the test domains of Connect-4 and Breakthrough.
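As a rough illustration of the rollout-phase idea described in the abstract, the sketch below lets a shallow, evaluation-free minimax search choose rollout moves, falling back to a random choice when nothing is proven. It is not the authors' implementation: the toy game (a pile of stones, each player removes 1 to 3, taking the last stone wins) and the names `legal_moves`, `shallow_minimax`, and `informed_rollout` are assumptions made for this example only.

```python
import random

# Toy game, assumed for illustration: players alternately remove 1-3 stones
# from a pile, and whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def shallow_minimax(stones, depth):
    """Depth-limited negamax without an evaluation function.

    Returns +1 if the player to move has a proven win within `depth` plies,
    -1 for a proven loss, and 0 if the search horizon proves nothing.
    """
    if stones == 0:
        return -1  # the previous player took the last stone
    if depth == 0:
        return 0   # horizon reached: outcome unknown
    best = -1
    for m in legal_moves(stones):
        best = max(best, -shallow_minimax(stones - m, depth - 1))
        if best == 1:
            break  # a proven win cannot be improved on
    return best

def informed_rollout(stones, depth, rng):
    """Play one rollout to the end; return the winner (0 moves first).

    Each move is drawn at random among the moves the shallow search rates
    best, so proven wins are taken and proven losses avoided when possible.
    Requires depth >= 1.
    """
    player = 0
    while True:
        scored = [(-shallow_minimax(stones - m, depth - 1), m)
                  for m in legal_moves(stones)]
        best = max(score for score, _ in scored)
        move = rng.choice([m for score, m in scored if score == best])
        stones -= move
        if stones == 0:
            return player
        player = 1 - player
```

With `depth=1` the rollout only spots immediate wins; a larger depth makes each rollout more tactically reliable but more expensive, which is the trade-off such hybrids have to balance.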

Mark H. M. Winands | Hendrik Baier
