Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs

Interactive partially observable Markov decision processes (I-POMDPs) provide a formal framework for planning by a self-interested agent in multiagent settings. An agent operating in a multiagent environment must deliberate about the actions that other agents may take and the effect these actions have on the environment and on the rewards it receives. Traditional I-POMDPs model this dependence on the actions of other agents using joint action and model spaces, so the solution complexity grows exponentially with the number of agents, severely limiting scalability. In this paper, we model and extend anonymity and context-specific independence -- problem structures often present in agent populations -- for computational gain. We empirically demonstrate the efficiency gained from exploiting these problem structures by solving a new multiagent problem involving more than 1,000 agents.
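The computational gain from action anonymity can be illustrated with a small counting sketch: if only the *number* of other agents choosing each action matters (not *which* agent chooses it), the planner can condition on action configurations -- multisets of size N over the action set -- rather than on joint actions. The function names below are illustrative, not from the paper; this is a minimal sketch of the counting argument, not of the authors' algorithm.

```python
from math import comb

def joint_action_space(num_agents, num_actions):
    # Traditional modeling: one entry per joint action of the other
    # agents, i.e. |A|^N -- exponential in the population size.
    return num_actions ** num_agents

def configuration_space(num_agents, num_actions):
    # Under action anonymity, only counts matter: the number of
    # multisets of size N over |A| actions ("stars and bars"),
    # C(N + |A| - 1, |A| - 1) -- polynomial in N for fixed |A|.
    return comb(num_agents + num_actions - 1, num_actions - 1)

# With 1,000 other agents and 3 actions each, the joint space has
# 3^1000 entries, while the configuration space has only 501,501.
print(configuration_space(1000, 3))  # 501501
```

For a fixed number of actions, the configuration space grows only as O(N^(|A|-1)), which is what makes planning over populations of 1,000+ agents tractable under anonymity.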
