Solving Hanabi: Estimating Hands by Opponent's Actions in Cooperative Game with Incomplete Information

A distinctive human behavior is adjusting one's own actions, which one cannot observe directly, based on the reactions of others in order to cooperate. We used the card game Hanabi as an evaluation task for imitating this human reflective intelligence with artificial intelligence. Hanabi is a cooperative card game with incomplete information. A player cooperates with an opponent to build several sets of cards, each made up of a single color in numerical order. However, as in blind man's bluff, each player sees the cards of all other players but not his/her own. In addition, communication between players is restricted to information about which cards share a number or a color, so the player must read the opponent's intention in light of the opponent's hand, estimate his/her own cards from incomplete information, and play one of them to build a set. We compared human play with several simulated strategies. The results indicate that the strategy that uses feedback from the simulated opponent's viewpoint achieves higher scores than the other strategies.
