A Concise Introduction to Decentralized POMDPs
[1] E. Rowland. Theory of Games and Economic Behavior , 1946, Nature.
[2] J. Marschak,et al. Elements for a Theory of Teams , 1955 .
[3] R. Radner,et al. Team Decision Problems , 1962 .
[4] H. Witsenhausen. Separation of estimation and control for discrete time systems , 1971 .
[5] R. Radner,et al. Economic theory of teams , 1972 .
[6] J. Walrand,et al. On delayed sharing patterns , 1978 .
[7] S. Marcus,et al. Decentralized control of finite state Markov processes , 1980, 1980 19th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.
[8] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[9] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[10] Munindar P. Singh. Multiagent Systems - A Theoretical Framework for Intentions, Know-How, and Communications , 1994, Lecture Notes in Computer Science.
[11] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[12] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[13] Ariel Rubinstein,et al. A Course in Game Theory , 1995 .
[14] Katia P. Sycara,et al. Exploiting Problem Structure for Distributed Constraint Optimization , 1995, ICMAS.
[15] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[16] Nicholas R. Jennings,et al. Controlling Cooperative Problem Solving in Industrial Multi-Agent Systems Using Joint Intentions , 1995, Artif. Intell..
[17] Anand S. Rao,et al. BDI Agents: From Theory to Practice , 1995, ICMAS.
[18] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[19] Nicholas R. Jennings,et al. Intelligent agents: theory and practice , 1995, The Knowledge Engineering Review.
[20] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[21] G. W. Wornell,et al. Decentralized control of a multiple access broadcast channel: performance bounds , 1996, Proceedings of 35th IEEE Conference on Decision and Control.
[22] Michael P. Georgeff,et al. Modelling and Design of Multi-Agent Systems , 1997, ATAL.
[23] M. Yokoo,et al. Distributed Breakout Algorithm for Solving Distributed Constraint Satisfaction Problems , 1996 .
[24] Milind Tambe,et al. Towards Flexible Teamwork , 1997, J. Artif. Intell. Res..
[25] Avi Pfeffer,et al. Representations and Solutions for Game-Theoretic Problems , 1997, Artif. Intell..
[26] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[27] Nicholas R. Jennings,et al. Agent-Based Computing: Promise and Perils , 1999, IJCAI.
[28] Anne Condon,et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems , 1999, AAAI/IAAI.
[29] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[30] Kee-Eung Kim,et al. Solving POMDPs by Searching the Space of Finite Policies , 1999, UAI.
[31] Hiroaki Kitano,et al. RoboCup Rescue: search and rescue in large-scale disasters as a domain for autonomous agents research , 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028).
[32] Victor R. Lesser,et al. Cooperative Multiagent Systems: A Personal View of the State of the Art , 1999, IEEE Trans. Knowl. Data Eng..
[33] Manuela M. Veloso,et al. Task Decomposition, Dynamic Role Assignment, and Low-Bandwidth Communication for Real-Time Strategic Teamwork , 1999, Artif. Intell..
[34] Kee-Eung Kim,et al. Learning Finite-State Controllers for Partially Observable Environments , 1999, UAI.
[35] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[36] Kee-Eung Kim,et al. Learning to Cooperate via Policy Search , 2000, UAI.
[37] Marco Wiering,et al. Multi-Agent Reinforcement Learning for Traffic Light control , 2000 .
[38] Jesse Hoey,et al. APRICODD: Approximate Policy Construction Using Decision Diagrams , 2000, NIPS.
[39] Michael L. Littman,et al. Graphical Models for Game Theory , 2001, UAI.
[40] Lex Weaver,et al. A Multi-Agent Policy-Gradient Approach to Network Routing , 2001, ICML.
[41] Julie A. Adams,et al. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , 2001, AI Mag..
[42] Victor R. Lesser,et al. Communication decisions in multi-agent cooperation: model and experiments , 2001, AGENTS '01.
[43] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[44] R. Aiba. Distributed Constraint Satisfaction: Foundations of Cooperation in Multi-Agent Systems , 2001 .
[45] Milind Tambe,et al. Team Formation for Reformation in Multiagent Domains Like RoboCupRescue , 2002, RoboCup.
[46] Leslie Pack Kaelbling,et al. Reinforcement Learning by Policy Search , 2002 .
[47] Craig Boutilier,et al. Value-Directed Compression of POMDPs , 2002, NIPS.
[48] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2011, J. Artif. Intell. Res..
[49] Milind Tambe,et al. Team Formation for Reformation , 2002 .
[50] Milind Tambe,et al. Role allocation and reallocation in multiagent teams: towards a practical analysis , 2003, AAMAS '03.
[51] Sebastian Thrun,et al. Planning under Uncertainty for Reliable Health Care Robotics , 2003, FSR.
[52] Barbara Messing,et al. An Introduction to MultiAgent Systems , 2002, Künstliche Intell..
[53] Shie Mannor,et al. The Cross Entropy Method for Fast Policy Search , 2003, ICML.
[54] Claudia V. Goldman,et al. The complexity of multiagent systems: the price of silence , 2003, AAMAS '03.
[55] Craig Boutilier,et al. Bounded Finite State Controllers , 2003, NIPS.
[56] David V. Pynadath,et al. Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.
[57] A. Koopman,et al. Simulation and optimization of traffic in a city , 2004, IEEE Intelligent Vehicles Symposium, 2004.
[58] Makoto Yokoo,et al. Communications for improving policy computation in distributed POMDPs , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[59] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[60] Leslie Pack Kaelbling,et al. Representing hierarchical POMDPs as DBNs for multi-scale robot localization , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[61] Milind Tambe,et al. An Automated Teamwork Infrastructure for Heterogeneous Software Agents and Humans , 2003, Autonomous Agents and Multi-Agent Systems.
[62] Nikos A. Vlassis,et al. Perseus: Randomized Point-based Value Iteration for POMDPs , 2005, J. Artif. Intell. Res..
[63] Makoto Yokoo,et al. Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs , 2005, IJCAI.
[64] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[65] Manuela M. Veloso,et al. Reasoning about joint beliefs for execution-time communication decisions , 2005, AAMAS '05.
[66] Milind Tambe,et al. Hybrid BDI-POMDP Framework for Multiagent Teaming , 2011, J. Artif. Intell. Res..
[67] Alberto Ribes,et al. Multi-Agent Systems , 2005, Proceedings of the 2005 International Conference on Active Media Technology (AMT 2005).
[68] Karl Tuyls,et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games , 2005, Autonomous Agents and Multi-Agent Systems.
[69] François Charpillet,et al. MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs , 2005, UAI.
[70] Manuela Veloso,et al. Decentralized Communication Strategies for Coordinated Multi-Agent Policies , 2005 .
[71] Cees Witteveen,et al. Multi-agent Planning An introduction to planning and coordination , 2005 .
[72] Joelle Pineau,et al. POMDP Planning for Robust Robot Control , 2005, ISRR.
[73] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[74] François Charpillet,et al. An Optimal Best-First Search Algorithm for Solving Infinite Horizon DEC-POMDPs , 2005, ECML.
[75] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes , 2005 .
[76] Nikos A. Vlassis,et al. Using the Max-Plus Algorithm for Multiagent Decision Making in Coordination Graphs , 2005, BNAIC.
[77] Brahim Chaib-draa,et al. An online POMDP algorithm for complex multiagent environments , 2005, AAMAS '05.
[78] Makoto Yokoo,et al. Adopt: asynchronous distributed constraint optimization with quality guarantees , 2005, Artif. Intell..
[79] Prashant Doshi,et al. Exact solutions of interactive POMDPs using behavioral equivalence , 2006, AAMAS '06.
[80] Victor R. Lesser,et al. Agent interaction in distributed POMDPs and its implications on complexity , 2006, AAMAS '06.
[81] Makoto Yokoo,et al. Exploiting Locality of Interaction in Networked Distributed POMDPs , 2006, AAAI Spring Symposium: Distributed Plan and Schedule Management.
[82] Nikos A. Vlassis,et al. Decentralized planning under uncertainty for teams of communicating agents , 2006, AAMAS '06.
[83] Frans A. Oliehoek,et al. A hierarchical model for decentralized fighting of large scale urban fires , 2006 .
[84] Nikos A. Vlassis,et al. Collaborative Multiagent Reinforcement Learning by Payoff Propagation , 2006, J. Mach. Learn. Res..
[85] François Charpillet,et al. Point-based Dynamic Programming for DEC-POMDPs , 2006, AAAI.
[86] Agostino Poggi,et al. Multiagent Systems , 2006, Intelligenza Artificiale.
[87] Ben J. A. Kröse,et al. Dynamic Bayesian Networks for Visual Surveillance with Distributed Cameras , 2006, EuroSSC.
[88] Marc Toussaint,et al. Probabilistic inference for solving (PO)MDPs , 2006 .
[89] Makoto Yokoo,et al. Winning back the CUP for distributed POMDPs: planning over continuous belief spaces , 2006, AAMAS '06.
[90] Shlomo Zilberstein,et al. Formal models and algorithms for decentralized decision making under uncertainty , 2008, Autonomous Agents and Multi-Agent Systems.
[91] Shlomo Zilberstein,et al. Memory-Bounded Dynamic Programming for DEC-POMDPs , 2007, IJCAI.
[92] Frans A. Oliehoek,et al. Dec-POMDPs with delayed communication , 2007 .
[93] Marek Petrik,et al. Average-Reward Decentralized Markov Decision Processes , 2007, IJCAI.
[94] Trey Smith,et al. Probabilistic planning for robotic exploration , 2007 .
[95] Stacy Marsella,et al. Minimal Mental Models , 2007, AAAI.
[96] Manuela M. Veloso,et al. Exploiting factored representations for decentralized execution in multiagent teams , 2007, AAMAS '07.
[97] Shlomo Zilberstein,et al. Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs , 2007, UAI.
[98] Makoto Yokoo,et al. Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies , 2007, AAMAS '07.
[99] Leslie Pack Kaelbling,et al. Automated Design of Adaptive Controllers for Modular Robots using Reinforcement Learning , 2008, Int. J. Robotics Res..
[100] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[101] Nikos A. Vlassis,et al. Multiagent Planning Under Uncertainty with Stochastic Communication Delays , 2008, ICAPS.
[102] Yoav Shoham,et al. Essentials of Game Theory: A Concise Multidisciplinary Introduction , 2008, Essentials of Game Theory: A Concise Multidisciplinary Introduction.
[103] Makoto Yokoo,et al. Not all agents are equal: scaling up distributed POMDPs for agent networks , 2008, AAMAS.
[104] Ronald T. van Katwijk,et al. Multi-Agent Look-Ahead Traffic-Adaptive Control , 2008 .
[105] Francisco S. Melo,et al. Interaction-driven Markov games for decentralized multiagent planning under uncertainty , 2008, AAMAS.
[106] Nikos A. Vlassis,et al. The Cross-Entropy Method for Policy Search in Decentralized POMDPs , 2008, Informatica.
[107] Leslie Pack Kaelbling,et al. Multi-Agent Filtering with Infinitely Nested Beliefs , 2008, NIPS.
[108] Shimon Whiteson,et al. Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs , 2008, ECML/PKDD.
[109] Marc Toussaint,et al. Hierarchical POMDP Controller Optimization by Likelihood Maximization , 2008, UAI.
[110] Richard R. Brooks,et al. Distributed Sensor Networks: A Multiagent Perspective , 2008 .
[111] Shimon Whiteson,et al. Exploiting locality of interaction in factored Dec-POMDPs , 2008, AAMAS.
[112] Milind Tambe,et al. Exploiting Coordination Locales in Distributed POMDPs via Social Model Shaping , 2009, ICAPS.
[113] Victor R. Lesser,et al. Offline Planning for Communication by Exploiting Structured Interactions in Decentralized MDPs , 2009, 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology.
[114] Marek Petrik,et al. A Bilinear Programming Approach for Multiagent Planning , 2009, J. Artif. Intell. Res..
[115] Rina Dechter,et al. AND/OR Branch-and-Bound search for combinatorial optimization in graphical models , 2009, Artif. Intell..
[116] Shimon Whiteson,et al. Lossless clustering of histories in decentralized POMDPs , 2009, AAMAS.
[117] Marc Toussaint,et al. Probabilistic inference as a model of planned behavior , 2009, Künstliche Intell..
[118] Mathijs de Weerdt,et al. Introduction to planning in multiagent systems , 2009, Multiagent Grid Syst..
[119] Edmund H. Durfee,et al. Flexible approximation of structured interactions in decentralized Markov decision processes , 2009, AAMAS.
[120] Nir Friedman,et al. Probabilistic Graphical Models - Principles and Techniques , 2009 .
[121] Shlomo Zilberstein,et al. Constraint-based dynamic programming for decentralized POMDPs with structured interactions , 2009, AAMAS.
[122] Nikos Vlassis. A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence , 2007, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[123] Victor R. Lesser,et al. Self-organization for coordinating decentralized reinforcement learning , 2010, AAMAS.
[124] Frans C. A. Groen,et al. A Distributed Approach to Gas Detection and Source Localization Using Heterogeneous Information , 2010, Interactive Collaborative Information Systems.
[125] Michael L. Littman,et al. Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration , 2010, ICML.
[126] Kavi Kumar Khedo,et al. A Wireless Sensor Network Air Pollution Monitoring System , 2010, ArXiv.
[127] Feng Wu,et al. Point-based policy generation for decentralized POMDPs , 2010, AAMAS.
[128] Shlomo Zilberstein,et al. Anytime Planning for Decentralized POMDPs using Expectation Maximization , 2010, UAI.
[129] Edmund H. Durfee,et al. Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs , 2010, ICAPS.
[130] Karl Tuyls,et al. Frequency adjusted multi-agent Q-learning , 2010, AAMAS.
[131] Frans A. Oliehoek,et al. Value-Based Planning for Teams of Agents in Stochastic Partially Observable Environments , 2010 .
[132] Pascal Poupart,et al. Partially Observable Markov Decision Processes , 2010, Encyclopedia of Machine Learning.
[133] Feng Wu,et al. Rollout Sampling Policy Iteration for Decentralized POMDPs , 2010, UAI.
[134] Edmund H. Durfee,et al. From policies to influences: a framework for nonlocal abstraction in transition-dependent Dec-POMDP agents , 2010, AAMAS.
[135] Shlomo Zilberstein,et al. Point-based backup for decentralized POMDPs: complexity and new algorithms , 2010, AAMAS.
[136] Frans A. Oliehoek,et al. Heuristic search for identical payoff Bayesian games , 2010, AAMAS.
[137] Edmund H. Durfee,et al. Towards a unifying characterization for quantifying weak coupling in dec-POMDPs , 2011, AAMAS.
[138] Jaakko Peltonen,et al. Efficient Planning for Factored Infinite-Horizon DEC-POMDPs , 2011, IJCAI.
[139] Edmund H. Durfee,et al. Abstracting Influences for Efficient Multiagent Coordination Under Uncertainty , 2011 .
[140] Marc Toussaint,et al. Scalable Multiagent Planning Using Probabilistic Inference , 2011, IJCAI.
[141] V. Lesser,et al. A Compact Mathematical Formulation For Problems With Structured Agent Interactions , 2011 .
[142] Pedro U. Lima,et al. Efficient Offline Communication Policies for Factored Multiagent POMDPs , 2011, NIPS.
[143] Frans A. Oliehoek,et al. Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion , 2011, IJCAI.
[144] Nicholas R. Jennings,et al. Bounded approximate decentralised coordination via the max-sum algorithm , 2009, Artif. Intell..
[145] Prasanna Velagapudi,et al. Distributed model shaping for scaling to decentralized POMDPs with hundreds of agents , 2011, AAMAS.
[146] Jian Luo,et al. Utilizing Partial Policies for Identifying Equivalence of Behavioral Models , 2011, AAAI.
[147] Ashutosh Nayyar,et al. Optimal Control Strategies in Delayed Sharing Information Structures , 2010, IEEE Transactions on Automatic Control.
[148] Jaakko Peltonen,et al. Periodic Finite State Controllers for Efficient POMDP and DEC-POMDP Planning , 2011, NIPS.
[149] Victor R. Lesser,et al. Compact Mathematical Programs For DEC-MDPs With Structured Agent Interactions , 2011, UAI.
[150] Milind Tambe,et al. Security and Game Theory - Algorithms, Deployed Systems, Lessons Learned , 2011 .
[151] Gerhard Weiss,et al. Multiagent Learning: Basics, Challenges, and Prospects , 2012, AI Mag..
[152] Leslie Pack Kaelbling,et al. Heuristic search of multiagent influence space , 2012, AAMAS.
[153] Leslie Pack Kaelbling,et al. Integrated robot task and motion planning in belief space , 2012 .
[154] Leslie Pack Kaelbling,et al. Influence-Based Abstraction for Multiagent Systems , 2012, AAAI.
[155] Frans A. Oliehoek,et al. Decentralized POMDPs , 2012, Reinforcement Learning.
[156] Peter Vrancx,et al. Reinforcement Learning: State-of-the-Art , 2012 .
[157] David Barber,et al. On the Computational Complexity of Stochastic Controller Optimization in POMDPs , 2011, TOCT.
[158] Shimon Whiteson,et al. Exploiting Structure in Cooperative Bayesian Games , 2012, UAI.
[159] Frans A. Oliehoek,et al. Sufficient Plan-Time Statistics for Decentralized POMDPs , 2013, IJCAI.
[160] Frans A. Oliehoek,et al. Incremental clustering and expansion for faster optimal planning in decentralized POMDPs , 2013 .
[161] Charles L. Isbell,et al. Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs , 2013, NIPS.
[162] Ashutosh Nayyar,et al. Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach , 2012, IEEE Transactions on Automatic Control.
[163] Feng Wu,et al. Monte-Carlo Expectation Maximization for Decentralized POMDPs , 2013, IJCAI.
[164] Leslie Pack Kaelbling,et al. Integrated task and motion planning in belief space , 2013, Int. J. Robotics Res..
[165] Hari Balakrishnan,et al. TCP ex machina: computer-generated congestion control , 2013, SIGCOMM.
[166] Shimon Whiteson,et al. Approximate solutions for factored Dec-POMDPs with many agents , 2013, AAMAS.
[167] Jaakko Peltonen,et al. Expectation Maximization for Average Reward Decentralized POMDPs , 2013, ECML/PKDD.
[168] Ashutosh Nayyar,et al. The Common-Information Approach to Decentralized Stochastic Control , 2014 .
[169] Milind Tambe,et al. Unleashing Dec-MDPs in Security Games: Enabling Effective Defender Teamwork , 2014, ECAI.
[170] Yi Ouyang,et al. Balancing through signaling in decentralized routing , 2014, 53rd IEEE Conference on Decision and Control.
[171] Frans A. Oliehoek,et al. Dec-POMDPs as Non-Observable MDPs , 2014 .
[172] Frans A. Oliehoek,et al. Influence-Optimistic Local Values for Multiagent Planning - Extended Version , 2015, ArXiv.
[173] Alborz Geramifard,et al. Decentralized control of Partially Observable Markov Decision Processes using belief space macro-actions , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[174] Frans A. Oliehoek,et al. Structure in the value function of zero-sum games of incomplete information , 2015 .
[175] Jonathan P. How,et al. Decision Making Under Uncertainty: Theory and Application , 2015 .
[176] Frans A. Oliehoek,et al. Factored Upper Bounds for Multiagent Planning Problems under Uncertainty with Non-Factored Value Functions , 2015, IJCAI.
[177] Shimon Whiteson,et al. Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection , 2015, AAAI.
[178] Aditya Mahajan,et al. Decentralized stochastic control , 2013, Annals of Operations Research.