On Markov Games Played by Bayesian and Boundedly-Rational Players

We present a new game-theoretic framework in which Bayesian players with bounded rationality engage in a Markov game and each has private but incomplete information regarding other players’ types. Instead of utilizing Harsanyi’s abstract types and a common prior, we construct intentional player types whose structure is explicit and induces a finite-level belief hierarchy. We characterize an equilibrium in this game and establish the conditions for its existence. The computation of such equilibria is formalized as a constraint satisfaction problem, and its effectiveness is demonstrated on two cooperative domains.
