A Simulation-Based General Game Player

Abstract—The aim of General Game Playing (GGP) is to create intelligent agents that can automatically learn how to play many different games at an expert level without any human intervention. The traditional design model for GGP agents has been to use a minimax-based game-tree search augmented with an automatically learned heuristic evaluation function. The first successful GGP agents all followed that approach. Here we describe CADIAPLAYER, a GGP agent employing a radically different approach: instead of a traditional game-tree search it uses Monte-Carlo simulations for its move decisions. Furthermore, we empirically evaluate different simulation-based approaches on a wide variety of games; introduce a domain-independent enhancement for automatically learning search-control knowledge to guide the simulation playouts; and show how to adapt the simulation searches to be more effective in single-agent games. CADIAPLAYER has already proven its effectiveness by winning the 2007 and 2008 AAAI GGP competitions.

Index Terms—Artificial intelligence, games, Monte Carlo methods, search methods.
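The abstract's core idea, deciding moves by running many random simulations and steering them with a bandit-based selection rule (UCB1, the rule underlying UCT), can be illustrated on a toy game. The sketch below is not CADIAPLAYER's implementation: the single-pile Nim domain, the function names, and the iteration count are all assumptions made for illustration.

```python
import math
import random

# Toy domain (an assumption, not from the paper): single-pile Nim where a
# move removes 1-3 stones and the player who takes the last stone wins.
def legal_moves(pile):
    return list(range(1, min(3, pile) + 1))

def random_playout(pile, player):
    """Play uniformly random moves to the end; return the winning player."""
    while pile > 0:
        pile -= random.choice(legal_moves(pile))
        if pile == 0:
            return player  # this player took the last stone
        player = 1 - player
    return 1 - player      # empty pile: the player to move has already lost

def ucb1_decide(pile, iterations=4000, c=math.sqrt(2)):
    """Pick a root move for player 0 via UCB1-guided Monte-Carlo sampling."""
    moves = legal_moves(pile)
    wins = {m: 0 for m in moves}
    visits = {m: 0 for m in moves}
    for t in range(1, iterations + 1):
        untried = [m for m in moves if visits[m] == 0]
        if untried:
            m = untried[0]  # sample every move once before applying UCB1
        else:
            # UCB1: empirical mean plus an exploration bonus that shrinks
            # as a move accumulates visits.
            m = max(moves, key=lambda a: wins[a] / visits[a]
                    + c * math.sqrt(math.log(t) / visits[a]))
        # After our move the opponent (player 1) is to move in the playout.
        winner = 0 if pile - m == 0 else random_playout(pile - m, player=1)
        visits[m] += 1
        wins[m] += (winner == 0)
    return max(moves, key=lambda a: visits[a])  # most-visited move at root
```

With enough iterations the sampling concentrates on the move with the highest simulation win rate: from a pile of 5 that is taking 1 stone (leaving a multiple of 4), and from a pile of 6 it is taking 2. CADIAPLAYER applies this selection rule throughout a growing game tree rather than only at the root, and augments the playouts with learned search control.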
