A Concise Introduction to Decentralized POMDPs

This book introduces multiagent planning under uncertainty as formalized by decentralized partially observable Markov decision processes (Dec-POMDPs). The intended audience is researchers and graduate students working in areas of artificial intelligence related to sequential decision making: reinforcement learning, decision-theoretic planning for single agents, classical multiagent planning, decentralized control, and operations research.
