Increasing scalability in algorithms for centralized and decentralized partially observable Markov decision processes: efficient decision-making and coordination in uncertain environments
[1] Shlomo Zilberstein, et al. Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs, 2010, Autonomous Agents and Multi-Agent Systems.
[2] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[3] Yixin Chen, et al. Solving Large-Scale Nonlinear Programming Problems by Constraint Partitioning, 2005, CP.
[4] Shlomo Zilberstein, et al. Achieving goals in decentralized POMDPs, 2009, AAMAS.
[5] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[6] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res.
[7] Marek Petrik, et al. Interaction Structure and Dimensionality Reduction in Decentralized MDPs, 2008, AAAI.
[8] Shlomo Zilberstein, et al. Formal Models and Algorithms for Decentralized Control of Multiple Agents, Technical Report UM-CS-2005-068, 2005.
[9] Zhengzhu Feng, et al. Dynamic Programming for POMDPs Using a Factored State Representation, 2000, AIPS.
[10] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.
[11] Victor R. Lesser, et al. Analyzing myopic approaches for multi-agent communication, 2005, IEEE/WIC/ACM International Conference on Intelligent Agent Technology.
[12] Claudia V. Goldman, et al. Solving Transition Independent Decentralized Markov Decision Processes, 2004, J. Artif. Intell. Res.
[13] Shlomo Zilberstein, et al. Incremental Policy Generation for Finite-Horizon DEC-POMDPs, 2009, ICAPS.
[14] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[15] Kee-Eung Kim, et al. Symbolic Heuristic Search Value Iteration for Factored POMDPs, 2008, AAAI.
[16] Edmund H. Durfee, et al. Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs, 2010, ICAPS.
[17] Nikos A. Vlassis, et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs, 2008, J. Artif. Intell. Res.
[18] James E. Eckles, et al. Optimum Maintenance with Incomplete Information, 1968, Oper. Res.
[19] Reid G. Simmons, et al. Probabilistic Navigation in Partially Observable Environments, 1995.
[20] François Charpillet, et al. MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs, 2005, UAI.
[21] Brian W. Kernighan, et al. AMPL: A Modeling Language for Mathematical Programming, 1993.
[22] Blai Bonet, et al. Solving POMDPs: RTDP-Bel vs. Point-based Algorithms, 2009, IJCAI.
[23] Feng Wu, et al. Multi-Agent Online Planning with Communication, 2009, ICAPS.
[24] Makoto Yokoo, et al. Not all agents are equal: scaling up distributed POMDPs for agent networks, 2008, AAMAS.
[25] Eric A. Hansen, et al. Synthesis of Hierarchical Finite-State Controllers for POMDPs, 2003, ICAPS.
[26] William S. Lovejoy, et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes, 1991, Oper. Res.
[27] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[28] Pascal Poupart, et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains, 2008, ISAIM.
[29] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[30] Jesse Hoey, et al. Assisting persons with dementia during handwashing using a partially observable Markov decision process, 2007, ICVS.
[31] Guy Shani, et al. Model-Based Online Learning of POMDPs, 2005, ECML.
[32] Daniel Kudenko, et al. Reinforcement learning of coordination in cooperative multi-agent systems, 2002, AAAI/IAAI.
[33] Pascal Poupart, et al. Automated Hierarchy Discovery for Planning in Partially Observable Environments, 2006, NIPS.
[34] Reid G. Simmons, et al. Probabilistic Robot Navigation in Partially Observable Environments, 1995, IJCAI.
[35] Craig Boutilier, et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations, 1996, AAAI/IAAI, Vol. 2.
[36] Milind Tambe, et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models, 2002, J. Artif. Intell. Res.
[37] R. Horst, et al. Global Optimization: Deterministic Approaches, 1992.
[38] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[39] Joelle Pineau, et al. Variable resolution decomposition for robotic navigation under a POMDP framework, 2010, IEEE International Conference on Robotics and Automation.
[40] Craig Boutilier, et al. Sequential Optimality and Coordination in Multiagent Systems, 1999, IJCAI.
[41] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[42] Héctor Geffner. Classical, Probabilistic, and Contingent Planning: Three Models, One Algorithm, 1998.
[43] Jan H. van Schuppen, et al. A Class of Team Problems with Discrete Action Spaces: Optimality Conditions Based on Multimodularity, 2000, SIAM J. Control Optim.
[44] Victor R. Lesser, et al. Communication decisions in multi-agent cooperation: model and experiments, 2001, AGENTS '01.
[45] Eric A. Hansen, et al. Indefinite-Horizon POMDPs with Action-Based Termination, 2007, AAAI.
[46] J. Marschak, et al. Elements for a Theory of Teams, 1955.
[47] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[48] Reid G. Simmons, et al. Point-Based POMDP Algorithms: Improved Analysis and Implementation, 2005, UAI.
[49] Kee-Eung Kim, et al. Learning to Cooperate via Policy Search, 2000, UAI.
[50] Claudia V. Goldman, et al. Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis, 2004, J. Artif. Intell. Res.
[51] Andrew McCallum, et al. Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data, 2004, J. Mach. Learn. Res.
[52] Benjamin Van Roy, et al. An approximate dynamic programming approach to decentralized control of stochastic systems, 2006.
[53] Blai Bonet, et al. Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs, 2010, AAAI.
[54] Shlomo Zilberstein, et al. Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs, 2007, UAI.
[55] Michael R. James, et al. Learning predictive state representations in dynamical systems without reset, 2005, ICML.
[56] Victor R. Lesser, et al. Multi-agent policies: from centralized ones to decentralized ones, 2002, AAMAS '02.
[57] Olivier Buffet, et al. Multi-Agent Systems by Incremental Gradient Reinforcement Learning, 2001, IJCAI.
[58] Nikos A. Vlassis, et al. Perseus: Randomized Point-based Value Iteration for POMDPs, 2005, J. Artif. Intell. Res.
[59] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[60] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.
[61] Victor R. Lesser, et al. Decentralized Markov decision processes with event-driven interactions, 2004, AAMAS.
[62] Shlomo Zilberstein, et al. Bounded Policy Iteration for Decentralized POMDPs, 2005, IJCAI.
[63] Shlomo Zilberstein, et al. Policy Iteration for Decentralized Control of Markov Decision Processes, 2009, J. Artif. Intell. Res.
[64] Marek Petrik, et al. Average-Reward Decentralized Markov Decision Processes, 2007, IJCAI.
[65] Guy Shani, et al. Improving Existing Fault Recovery Policies, 2009, NIPS.
[66] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[67] Shlomo Zilberstein, et al. Value-based observation compression for DEC-POMDPs, 2008, AAMAS.
[68] Brahim Chaib-draa, et al. Exact Dynamic Programming for Decentralized POMDPs with Lossless Policy Compression, 2008, ICAPS.
[69] Shlomo Zilberstein, et al. Memory-Bounded Dynamic Programming for DEC-POMDPs, 2007, IJCAI.
[70] Jeffrey D. Ullman, et al. Introduction to Automata Theory, Languages and Computation, 1979.
[71] A. Cassandra. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[72] Jeff G. Schneider, et al. Approximate solutions for partially observable stochastic games with common payoffs, 2004, AAMAS.
[73] François Charpillet, et al. An Optimal Best-First Search Algorithm for Solving Infinite Horizon DEC-POMDPs, 2005, ECML.
[74] Shlomo Zilberstein, et al. Optimizing Memory-Bounded Controllers for Decentralized POMDPs, 2007, UAI.
[75] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[76] R. Radner, et al. Team Decision Problems, 1962.
[77] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[78] François Charpillet, et al. Mixed Integer Linear Programming for Exact Finite-Horizon Planning in Decentralized POMDPs, 2007, ICAPS.
[79] Makoto Yokoo, et al. Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs, 2005, IJCAI.
[80] A. Cassandra. A Survey of POMDP Applications, 2003.
[81] Shimon Whiteson, et al. Exploiting locality of interaction in factored Dec-POMDPs, 2008, AAMAS.
[82] Brahim Chaib-draa, et al. Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs, 2009, AAMAS.
[83] Shlomo Zilberstein, et al. Dynamic Programming for Partially Observable Stochastic Games, 2004, AAAI.
[84] Claudia V. Goldman, et al. Optimizing information exchange in cooperative multi-agent systems, 2003, AAMAS '03.
[85] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[86] Peter L. Bartlett, et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent, 2000, ICML.
[87] Shlomo Zilberstein, et al. Formal models and algorithms for decentralized decision making under uncertainty, 2008, Autonomous Agents and Multi-Agent Systems.
[88] Manuela M. Veloso, et al. Reasoning about joint beliefs for execution-time communication decisions, 2005, AAMAS '05.
[89] Joelle Pineau, et al. Online Planning Algorithms for POMDPs, 2008, J. Artif. Intell. Res.
[90] Edward J. Sondik, et al. The optimal control of partially observable Markov processes, 1971.
[91] Martin Allen, et al. Complexity of Decentralized Control: Special Cases, 2009, NIPS.
[92] Shlomo Zilberstein, et al. Solving POMDPs using quadratically constrained linear programs, 2006, AAMAS '06.
[93] François Charpillet, et al. Point-based Dynamic Programming for DEC-POMDPs, 2006, AAAI.
[94] Anne Condon, et al. On the undecidability of probabilistic planning and related stochastic optimization problems, 2003, Artif. Intell.
[95] Marek Petrik, et al. A Bilinear Programming Approach for Multiagent Planning, 2009, J. Artif. Intell. Res.
[96] Hui Li, et al. Point-Based Policy Iteration, 2007, AAAI.
[97] Milos Hauskrecht, et al. Modeling treatment of ischemic heart disease with partially observable Markov decision processes, 1998, AMIA.
[98] Shimon Whiteson, et al. Lossless clustering of histories in decentralized POMDPs, 2009, AAMAS.
[99] Dimitri P. Bertsekas, et al. Nonlinear Programming, 1997.
[100] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[101] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[102] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[103] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[104] Hui Li, et al. Incremental Least Squares Policy Iteration for POMDPs, 2006, AAAI.
[105] Craig Boutilier, et al. Coordination in multiagent reinforcement learning: a Bayesian approach, 2003, AAMAS '03.
[106] Geoffrey J. Gordon, et al. Finding Approximate POMDP Solutions Through Belief Compression, 2005, J. Artif. Intell. Res.
[107] Michael A. Saunders, et al. SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization, 2002, SIAM J. Optim.
[108] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[109] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[110] S. Patek. On partially observed stochastic shortest path problems, 2001, Proceedings of the 40th IEEE Conference on Decision and Control.
[111] Makoto Yokoo, et al. Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings, 2003, IJCAI.
[112] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[113] P. J. Gmytrasiewicz, et al. A Framework for Sequential Planning in Multi-Agent Settings, 2005, AI&M.
[114] Yu-Chi Ho, et al. Team decision theory and information structures, 1980.
[115] Xiaofeng Wang, et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games, 2002, NIPS.
[116] Gerhard Weiss, et al. Multiagent Systems, 1999.
[117] Claudia V. Goldman, et al. Learning to communicate in a decentralized environment, 2007, Autonomous Agents and Multi-Agent Systems.
[118] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes, 2005.
[119] Shlomo Zilberstein, et al. Constraint-based dynamic programming for decentralized POMDPs with structured interactions, 2009, AAMAS.