Exploiting Structure in Cooperative Bayesian Games

Cooperative Bayesian games (BGs) can model decision-making problems for teams of agents under imperfect information, but require space and computation time that is exponential in the number of agents. While agent independence has been used to mitigate these problems in perfect information settings, we propose a novel approach for BGs based on the observation that BGs additionally possess a different type of structure, which we call type independence. We propose a factor graph representation that captures both forms of independence and present a theoretical analysis showing that non-serial dynamic programming cannot effectively exploit type independence, while MAX-SUM can. Experimental results demonstrate that our approach can tackle cooperative Bayesian games of unprecedented size.
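The structure mentioned in the abstract can be illustrated with a toy example. The sketch below assumes the standard definition of a cooperative BG, in which the value of a joint policy is the expected payoff over joint types. Under that assumption, each joint type contributes one term that depends only on the actions chosen for the individual types it is composed of, so the terms can be read as factors over (agent, type) variables. The game itself (two agents, two types, two actions, a uniform type distribution, and the payoff function) is hypothetical, and the solver is plain brute-force enumeration rather than Max-Sum or the paper's method.

```python
from itertools import product

# Hypothetical toy game: 2 agents, each with 2 individual types and 2 actions.
types = [(0, 1), (0, 1)]      # types[i] lists the individual types of agent i
actions = [(0, 1), (0, 1)]    # actions[i] lists the actions of agent i


def prob(theta):
    """Joint type probability (assumed uniform for this illustration)."""
    return 0.25


def payoff(theta, joint_action):
    """Shared team payoff (hypothetical): match own type, plus a bonus for agreeing."""
    a1, a2 = joint_action
    return float(a1 == theta[0]) + float(a2 == theta[1]) + 0.5 * float(a1 == a2)


# One decision variable per (agent, individual type) pair.
variables = [(i, t) for i, agent_types in enumerate(types) for t in agent_types]


def factor(theta, assignment):
    """Contribution of one joint type; it only reads the variables (i, theta[i])."""
    joint_action = tuple(assignment[(i, t)] for i, t in enumerate(theta))
    return prob(theta) * payoff(theta, joint_action)


def value(assignment):
    """Expected payoff of a joint policy, written as a sum of per-joint-type factors."""
    return sum(factor(theta, assignment) for theta in product(*types))


def solve_brute_force():
    """Enumerate every joint policy; the cost is exponential, so toy sizes only."""
    domains = [actions[i] for i, _ in variables]
    best = None
    for choice in product(*domains):
        assignment = dict(zip(variables, choice))
        v = value(assignment)
        if best is None or v > best[0]:
            best = (v, assignment)
    return best


best_value, best_policy = solve_brute_force()
print(best_value, best_policy)
```

In this toy game the enumeration covers 16 joint policies. The point of the sketch is only the shape of `value`: a sum of small factors, which is the kind of structure a message-passing method could in principle exploit instead of enumerating every joint policy.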

Shimon Whiteson | Frans A. Oliehoek | Matthijs T. J. Spaan
