Improved use of partial policies for identifying behavioral equivalence

Interactive multiagent decision making often requires predicting the actions of other agents by solving their behavioral models from the perspective of the modeling agent. Unfortunately, in the absence of constraining assumptions, the general space of models tends to be very large, which makes multiagent decision making intractable. One approach to reducing the model space is to cluster behaviorally equivalent models, that is, models that exhibit identical policies over the whole planning horizon. The current state of the art in identifying equivalence of behavioral models compares partial policy trees instead of entire trees. In this paper, we further improve the use of partial trees for identifying equivalence and develop an incremental comparison strategy that ascertains model equivalence efficiently. We investigate the improved approach in interactive dynamic influence diagrams, a well-defined probabilistic graphical model for sequential multiagent decision making, and evaluate its performance over multiple problem domains.
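To make the idea of comparing partial policy trees level by level concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a policy tree is a nested structure of actions indexed by observations; the names `PolicyNode`, `level_signature`, `agree_up_to`, and `cluster_models` are hypothetical and introduced only for illustration. The comparison stops at the first depth where two trees differ, so deeper levels are never expanded for models that already disagree. Agreement up to a partial depth does not by itself establish equivalence over the whole horizon; any further conditions required for that are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PolicyNode:
    """One node of a (hypothetical) policy tree: an action plus one subtree per observation."""
    action: str
    children: Dict[str, "PolicyNode"] = field(default_factory=dict)


def level_signature(root: PolicyNode, depth: int) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (observation path, action) pairs for all nodes at exactly `depth`."""
    frontier = [((), root)]
    for _ in range(depth):
        frontier = [
            (path + (obs,), child)
            for path, node in frontier
            for obs, child in sorted(node.children.items())
        ]
    return [(path, node.action) for path, node in frontier]


def agree_up_to(a: PolicyNode, b: PolicyNode, max_depth: int) -> bool:
    """Compare two trees one level at a time, stopping at the first mismatch."""
    for depth in range(max_depth + 1):
        if level_signature(a, depth) != level_signature(b, depth):
            return False  # early exit: deeper levels are never expanded
    return True


def cluster_models(models: Dict[str, PolicyNode], max_depth: int) -> List[List[str]]:
    """Group model names whose partial trees agree up to `max_depth`."""
    clusters: List[List[str]] = []
    for name, tree in models.items():
        for members in clusters:
            # Compare against the first member, used as the cluster representative.
            if agree_up_to(models[members[0]], tree, max_depth):
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters
```

Each cluster is represented by its first member, so a new model is compared against one tree per cluster rather than against every other model.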
