Recursive Monte Carlo search for imperfect information games

Perfect information Monte Carlo (PIMC) search is the method of choice for constructing strong AI systems for trick-taking card games. PIMC search evaluates moves in imperfect information games by repeatedly sampling worlds based on state inference and estimating move values by solving the corresponding perfect information scenarios. It performs well in trick-taking card games even though it suffers from the strategy fusion problem: because moves are evaluated opportunistically in each sampled world, the game's information set structure is ignored. In this paper we describe imperfect information Monte Carlo (IIMC) search, which aims to mitigate this problem by basing move evaluation on more realistic playout sequences rather than on perfect information move values. We show that RecPIMC, a recursive IIMC search variant based on perfect information evaluation, performs considerably better than PIMC search in a large class of synthetic imperfect information games and in the popular card game Skat, for which PIMC search is the state-of-the-art cardplay algorithm.
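To make the PIMC evaluation loop and the strategy fusion problem concrete, the following is a minimal sketch on a hypothetical two-world toy game; it is not the paper's implementation. The worlds, the moves `safe` and `guess`, the payoffs, and all function names are assumptions chosen for illustration. The move `guess` leads to a follow-up guess of the hidden world (+1 if correct, -1 otherwise), while `safe` always pays 0.5.

```python
import random

# Toy assumption: two hidden worlds, equally likely from the player's view.
WORLDS = ("w1", "w2")


def perfect_info_value(world, move):
    """Value of `move` when the hidden world is revealed (toy payoffs)."""
    if move == "safe":
        return 0.5
    # "guess": the follow-up guess pays +1 if it matches the world, else -1.
    # A perfect information solver sees the world and always guesses right.
    return max(1.0 if guess == world else -1.0 for guess in WORLDS)


def pimc_choose(moves, n_samples, rng):
    """Sample worlds, solve each as a perfect information game, average."""
    totals = {m: 0.0 for m in moves}
    for _ in range(n_samples):
        world = rng.choice(WORLDS)  # stand-in for inference-based sampling
        for m in moves:
            totals[m] += perfect_info_value(world, m)
    averages = {m: totals[m] / n_samples for m in moves}
    best = max(moves, key=lambda m: averages[m])
    return best, averages


def true_guess_value():
    """Best expected value of "guess" when the follow-up guess must be the
    same in every world, because the player cannot tell the worlds apart."""
    return max(
        sum(1.0 if guess == w else -1.0 for w in WORLDS) / len(WORLDS)
        for guess in WORLDS
    )


if __name__ == "__main__":
    best, averages = pimc_choose(["safe", "guess"], 100, random.Random(0))
    print(best, averages, true_guess_value())
```

In this toy game the perfect information evaluation rates `guess` at 1.0 in every sampled world, so the sketch prefers it over `safe`, even though a player who cannot distinguish the worlds can expect only 0.0 from `guess`. This gap is the kind of overestimation that evaluating moves by more realistic playouts is meant to reduce.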
