Paper Information - Approximating Value Equivalence in Interactive Dynamic Influence Diagrams Using Behavioral Coverage

Interactive dynamic influence diagrams (I-DIDs) provide an explicit way of modeling how a subject agent solves decision-making problems in the presence of other agents in a common setting. To optimize its decisions, the subject agent needs to predict the other agents' behavior, which is generally obtained by solving their candidate models. This becomes extremely difficult because the model space may be large and grows as the other agents act and observe over time. A recent proposal for solving I-DIDs relies on the concept of value equivalence (VE), which shows potential for significantly reducing the model space. In this paper, we establish a principled framework for implementing VE techniques and propose an approximate method for computing the VE of candidate models. This development offers ample opportunity to exploit VE to further improve the scalability of I-DID solutions. We theoretically analyze the properties of the approximate techniques and show empirical results in multiple problem domains.
