Paper Information - Understanding Sampling Style Adversarial Search Methods

Understanding Sampling Style Adversarial Search Methods

UCT has recently emerged as an exciting new adversarial reasoning technique based on cleverly balancing exploration and exploitation in a Monte-Carlo sampling setting. It has been particularly successful in the game of Go, but the reasons for its success are not well understood, and attempts to replicate that success in other domains such as Chess have failed. We provide an in-depth analysis of the potential of UCT in domain-independent settings and in cases where heuristic values are available, and of the effect of enhancing random playouts into more informed playouts between two weak minimax players. To provide further insights, we develop synthetic game tree instances and discuss interesting properties of UCT, both empirically and analytically.

Bart Selman | Ashish Sabharwal | Raghuram Ramanujan

[1] David Silver, et al. Combining online and offline knowledge in UCT, 2007, ICML '07.

[2] Michael C. Fu, et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes, 2005, Oper. Res.

[3] David Silver, et al. Achieving Master Level Play in 9 × 9 Computer Go, 2008, AAAI.

[4] Judea Pearl, et al. On the Nature of Pathology in Game Searching, 1983, Artif. Intell.

[5] Matthew L. Ginsberg, et al. GIB: Steps Toward an Expert-Level Bridge-Playing Program, 1999, IJCAI.

[6] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.

[7] Dana S. Nau, et al. Pathology on Game Trees Revisited, and an Alternative to Minimaxing, 1983, Artif. Intell.

[8] Yngvi Björnsson, et al. Simulation-Based Approach to General Game Playing, 2008, AAAI.

[9] Bart Selman, et al. On Adversarial Search Spaces and Sampling-Based Planning, 2010, ICAPS.

[10] Rémi Munos, et al. Bandit Algorithms for Tree Search, 2007, UAI.

[11] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.

[12] Brian Sheppard, et al. World-championship-caliber Scrabble, 2002, Artif. Intell.