Decision-Theoretic Planning in Multiagent Settings with Application to Behavioral Modeling

Optimal planning in environments shared with other interacting agents often involves recognizing the intent of others and their plans. This is because others' actions may impact the state of the environment and, consequently, the efficacy of the agent's plan. Planning becomes further complicated in the presence of uncertainty, which may arise because the state is partially observable, the dynamic state changes nondeterministically, and sensors are imperfect. A framework for decision-theoretic planning in this space is the interactive partially observable Markov decision process (I-POMDP), which generalizes the well-known POMDP to multiagent settings. This chapter describes the general I-POMDP framework and a particular approximation that facilitates its usage. Because I-POMDPs elegantly integrate beliefs and the modeling of others into the subject agent's decision process, they are well suited to modeling human behavioral data obtained from strategic games. We explore how well models based on simplified I-POMDPs fit experimental data in which participants engage in theory-of-mind-based recursive reasoning.
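
For concreteness, here is a minimal sketch of the finitely nested I-POMDP in the spirit of its standard formulation; the notation is illustrative and is not a verbatim restatement of the chapter's definitions. Agent $i$ reasoning at level $l$ solves

\[
\text{I-POMDP}_{i,l} \;=\; \langle IS_{i,l},\, A,\, T_i,\, \Omega_i,\, O_i,\, R_i \rangle,
\qquad
IS_{i,l} \;=\; S \times M_{j,l-1},
\]

where $M_{j,l-1}$ is the set of possible models of the other agent $j$, including intentional models $\theta_{j,l-1} = \langle b_{j,l-1}, \hat{\theta}_j \rangle$ that ascribe to $j$ its own (lower-level) beliefs and frame. Agent $i$'s belief is a distribution over these interactive states, $b_{i,l} \in \Delta(IS_{i,l})$, and its Bayesian update must also anticipate $j$'s possible actions and observations:

\[
b'_{i,l}(is') \;\propto\; \sum_{is}\, b_{i,l}(is)
\sum_{a_j} \Pr(a_j \mid \theta_{j,l-1})\,
T(s, a_i, a_j, s')\, O_i(s', a_i, a_j, o_i)
\sum_{o_j} O_j(s', a_i, a_j, o_j)\, \tau\!\left(b_{j,l-1}, a_j, o_j, b'_{j,l-1}\right),
\]

where $\tau(\cdot)$ indicates whether $j$'s updated belief in $is'$ is consistent with its modeled belief update. Nesting the modeling of $j$ inside $i$'s own belief update is what makes the framework a natural fit for recursive, theory-of-mind style reasoning, and it is also the source of the computational burden that motivates the simplified I-POMDP models used to fit the behavioral data.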
