Diverse Agents for Ad-Hoc Cooperation in Hanabi

In complex scenarios where a model of other actors is needed to predict and interpret their actions, it is often desirable for that model to work well with a wide variety of previously unknown actors. Hanabi is a card game that brings the problem of modeling other players to the forefront, but there is no agreement on how best to generate a pool of agents to serve as partners when evaluating ad-hoc cooperation. This paper proposes Quality Diversity algorithms as a promising class of algorithms for generating such populations and presents an initial implementation of an agent generator based on this idea. We also discuss which metrics can be used to compare such generators, and how the proposed generator could be leveraged to help build adaptive agents for the game.
