Sequential Halving for Partially Observable Games

This paper investigates Sequential Halving as a selection policy in four partially observable games: Go Fish, Lost Cities, Phantom Domineering, and Phantom Go. It also studies H-MCTS, which uses Sequential Halving at the root of the search tree and UCB elsewhere. Experimental results show that H-MCTS performs best in Go Fish, while its performance is on par in Lost Cities and Phantom Domineering. In Phantom Go, Sequential Halving used as a flat Monte-Carlo search appears to be the stronger technique.
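
For readers unfamiliar with the technique, the following is a minimal sketch of Sequential Halving as a flat bandit procedure, not the implementation evaluated in the paper. It splits a fixed budget evenly over ceil(log2 n) rounds, samples every surviving move equally within a round, and discards the worse half after each round. The names `sequential_halving` and `bernoulli_arm` are hypothetical. Keeping cumulative statistics across rounds and representing each move as a reward-returning callable are assumptions made for illustration. In a game setting, each callable would stand in for one random playout after the corresponding root move.

```python
import math
import random


def bernoulli_arm(p):
    """Hypothetical stand-in for a playout: returns 1.0 with probability p, else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def sequential_halving(arms, budget, rng):
    """Return the index of the arm selected by Sequential Halving.

    arms   -- list of callables; each takes an rng and returns a reward
    budget -- total number of samples to spend (approximately)
    rng    -- a random.Random instance passed through to the arms
    """
    n = len(arms)
    if n == 1:
        return 0
    rounds = math.ceil(math.log2(n))
    survivors = list(range(n))
    totals = [0.0] * n  # cumulative reward per arm (assumption: kept across rounds)
    counts = [0] * n    # cumulative number of samples per arm
    for _ in range(rounds):
        # Equal share of this round's budget for every surviving arm, at least 1.
        pulls = max(1, budget // (len(survivors) * rounds))
        for i in survivors:
            for _ in range(pulls):
                totals[i] += arms[i](rng)
                counts[i] += 1
        # Keep the better half (rounded up) by empirical mean reward.
        survivors.sort(key=lambda i: totals[i] / counts[i], reverse=True)
        survivors = survivors[:math.ceil(len(survivors) / 2)]
    return survivors[0]


if __name__ == "__main__":
    rng = random.Random(0)
    moves = [bernoulli_arm(p) for p in (0.1, 0.2, 0.9, 0.3)]
    print("selected move:", sequential_halving(moves, budget=4000, rng=rng))
```

Under this reading, H-MCTS would apply a procedure of this kind only at the root and fall back to UCB for selection deeper in the tree. The sketch covers the flat, root-only case.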
