Paper Information - Sequential Halving for Partially Observable Games

Sequential Halving for Partially Observable Games

This paper investigates Sequential Halving as a selection policy in four partially observable games: Go Fish, Lost Cities, Phantom Domineering, and Phantom Go. It also studies H-MCTS, which uses Sequential Halving at the root of the search tree and UCB elsewhere. Experimental results show that H-MCTS performs best in Go Fish and performs on par with the other techniques in Lost Cities and Phantom Domineering. In Phantom Go, Sequential Halving used as a flat Monte-Carlo search appears to be the stronger technique.
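For readers unfamiliar with the selection policy named above, the following is a minimal sketch of Sequential Halving in its standard multi-armed bandit form, not the paper's game implementation. It assumes a fixed sampling budget that is split evenly over about log2(n) rounds, with the worse half of the surviving arms discarded after each round. The function name `sequential_halving`, the representation of arms as callables, and the choice to rank arms using only the current round's samples are illustrative assumptions; in a game setting each call to an arm would correspond to one playout after the candidate move.

```python
import math
import random


def sequential_halving(arms, budget, rng):
    """Return the index of the arm that survives all halving rounds.

    arms   -- list of callables; arms[i](rng) returns one sampled reward
              (hypothetical interface chosen for this sketch)
    budget -- total number of samples available across all rounds
    rng    -- random.Random instance passed through to the arms
    """
    if not arms:
        raise ValueError("at least one arm is required")

    survivors = list(range(len(arms)))
    # Number of halving rounds needed to get from n arms down to one.
    rounds = max(1, math.ceil(math.log2(len(arms))))

    while len(survivors) > 1:
        # Split the budget evenly over rounds and over surviving arms.
        pulls = max(1, budget // (len(survivors) * rounds))
        means = {}
        for a in survivors:
            total = sum(arms[a](rng) for _ in range(pulls))
            means[a] = total / pulls
        # Keep the better half (rounded up) by empirical mean.
        survivors.sort(key=lambda a: means[a], reverse=True)
        survivors = survivors[:math.ceil(len(survivors) / 2)]

    return survivors[0]


def bernoulli(p):
    """Build an arm that pays 1 with probability p and 0 otherwise."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    arms = [bernoulli(p) for p in (0.2, 0.5, 0.8, 0.4)]
    print("selected arm:", sequential_halving(arms, budget=4000, rng=rng))
```

Under this reading, H-MCTS would apply such a halving schedule only to the moves at the root and fall back to UCB for selection deeper in the tree; that combination is not shown here.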
