Partially Observable Markov Decision Processes
[1] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[2] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[3] Panos E. Trahanias, et al. Real-time hierarchical POMDPs for autonomous robot navigation, 2007, Robotics Auton. Syst.
[4] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[5] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982.
[6] Jesse Hoey, et al. Solving POMDPs with Continuous or Large Discrete Observation Spaces, 2005, IJCAI.
[7] Craig Boutilier, et al. Stochastic Local Search for POMDP Controllers, 2004, AAAI.
[8] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.
[9] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[10] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[11] Nikos A. Vlassis, et al. Robot Planning in Partially Observable Continuous Domains, 2005, BNAIC.
[12] Alex Pentland, et al. Active gesture recognition using partially observable Markov decision processes, 1996, ICPR.
[13] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.
[14] Steven L. Shafer, et al. Comparison of Some Suboptimal Control Policies in Medical Drug Therapy, 1996, Oper. Res.
[15] Joelle Pineau, et al. An integrated approach to hierarchy and abstraction for POMDPs, 2002.
[16] W. Burgard, et al. Markov Localization for Mobile Robots in Dynamic Environments, 1999, J. Artif. Intell. Res.
[17] Joelle Pineau, et al. Active Learning in Partially Observable Markov Decision Processes, 2005, ECML.
[18] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[19] Peter Vrancx, et al. Reinforcement Learning: State-of-the-Art, 2012.
[20] Yossi Aviv, et al. A Partially Observed Markov Decision Process for Dynamic Pricing, 2005, Manag. Sci.
[21] Chelsea C. White, et al. A survey of solution techniques for the partially observed Markov decision process, 1991, Ann. Oper. Res.
[22] Nicholas Roy, et al. The permutable POMDP: fast solutions to POMDPs for preference elicitation, 2008, AAMAS.
[23] Zhengzhu Feng, et al. Dynamic Programming for POMDPs Using a Factored State Representation, 2000, AIPS.
[24] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[25] Blai Bonet, et al. An ε-Optimal Grid-Based Algorithm for Partially Observable Markov Decision Processes, 2002, ICML.
[26] Stuart J. Russell, et al. Approximating Optimal Policies for Partially Observable Stochastic Domains, 1995, IJCAI.
[27] Pascal Poupart, et al. Point-Based Value Iteration for Continuous POMDPs, 2006, J. Mach. Learn. Res.
[28] Anne Condon, et al. On the undecidability of probabilistic planning and related stochastic optimization problems, 2003, Artif. Intell.
[29] Shlomo Zilberstein, et al. Formal models and algorithms for decentralized decision making under uncertainty, 2008, Autonomous Agents and Multi-Agent Systems.
[30] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[31] Nicholas Roy, et al. Exponential Family PCA for Belief Compression in POMDPs, 2002, NIPS.
[32] Craig Boutilier, et al. Value-Directed Compression of POMDPs, 2002, NIPS.
[33] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[34] E. Dynkin. Controlled Random Sequences, 1965.
[35] Hsien-Te Cheng, et al. Algorithms for partially observable Markov decision processes, 1989.
[36] Sebastian Thrun, et al. Probabilistic robotics, 2002, CACM.
[37] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[38] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.
[39] Eric A. Hansen, et al. An Improved Grid-Based Approximation Algorithm for POMDPs, 2001, IJCAI.
[40] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[41] Steve J. Young, et al. Partially observable Markov decision processes for spoken dialog systems, 2007, Comput. Speech Lang.
[42] Milos Hauskrecht, et al. Planning treatment of ischemic heart disease with partially observable Markov decision processes, 2000, Artif. Intell. Medicine.
[43] Ross B. Corotis, et al. Inspection, Maintenance, and Repair with Partial Observability, 1995.
[44] Marco Wiering, et al. Utile distinction hidden Markov models, 2004, ICML.
[45] Joelle Pineau, et al. Spoken Dialog Management for Robots, 2000, ACL.
[46] A. Yezzi, et al. Local or Global Minima: Flexible Dual-Front Active Contours, 2007.
[47] Joelle Pineau, et al. Online Planning Algorithms for POMDPs, 2008, J. Artif. Intell. Res.
[48] S. Nanda. Mathematical Analysis and Applications, 2004.
[49] Nikos A. Vlassis, et al. Perseus: Randomized Point-based Value Iteration for POMDPs, 2005, J. Artif. Intell. Res.
[50] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[51] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[52] Edward J. Sondik. The optimal control of partially observable Markov processes, 1971.
[53] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[54] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[55] E. J. Sondik. The Optimal Control of Partially Observable Markov Processes, 1971.
[56] Karl Johan Åström. Optimal control of Markov processes with incomplete state information, 1965.
[57] Jan Peters. Policy gradient methods, 2010, Scholarpedia.
[58] Marc Toussaint, et al. Model-free reinforcement learning as mixture learning, 2009, ICML.
[59] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[60] Scott Sanner, et al. Symbolic Dynamic Programming for First-order POMDPs, 2010, AAAI.
[61] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[62] Alvin W. Drake. Observation of a Markov process through a noisy channel, 1962.
[63] Joelle Pineau, et al. Towards robotic assistants in nursing homes: Challenges and results, 2003, Robotics Auton. Syst.
[64] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[65] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.
[66] Andrew G. Barto, et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes, 2002.
[67] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[68] Nikos A. Vlassis, et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs, 2008, J. Artif. Intell. Res.
[69] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[70] Nikos A. Vlassis, et al. A point-based POMDP algorithm for robot planning, 2004, ICRA.
[71] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[72] C. R. Sox, et al. Adaptive Inventory Control for Nonstationary Demand and Partial Information, 2002, Manag. Sci.
[73] Jesse Hoey, et al. A Decision-Theoretic Approach to Task Assistance for Persons with Dementia, 2005, IJCAI.
[74] Anthony R. Cassandra, et al. Development and Evaluation of a Bayesian Low-Vision Navigation Aid, 2007, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[75] William S. Lovejoy. Computationally Feasible Bounds for Partially Observed Markov Decision Processes, 1991, Oper. Res.
[76] Pedro U. Lima, et al. Active cooperative perception in network robot systems using POMDPs, 2010, IROS.
[77] Wenju Liu, et al. Planning in Stochastic Domains: Problem Characteristics and Approximation, 1996.
[78] Chelsea C. White, et al. A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes, 2004, INFORMS J. Comput.
[79] Eric A. Hansen. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[80] Roni Khardon, et al. Relational Partially Observable MDPs, 2010, AAAI.
[81] Andrew McCallum. Instance-Based Utile Distinctions for Reinforcement Learning, 1995.
[82] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[83] Jonathan Baxter, et al. Scaling Internal-State Policy-Gradient Methods for POMDPs, 2002.
[84] Deb Roy, et al. Connecting language to the world, 2005, Artif. Intell.
[85] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[86] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[87] Guy Shani, et al. Resolving Perceptual Aliasing in the Presence of Noisy Sensors, 2004, NIPS.
[88] Robert G. Haight, et al. Optimal control of an invasive species with imperfect information about the level of infestation, 2010.
[89] David Hsu, et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces, 2008, Robotics: Science and Systems.
[90] Joelle Pineau, et al. Bayes-Adaptive POMDPs, 2007, NIPS.
[91] Milind Tambe, et al. Exploiting belief bounds: practical POMDPs for personal assistant agents, 2005, AAMAS.
[92] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes, 2005.
[93] Jesse Hoey, et al. Value-Directed Human Behavior Analysis from Video Using Partially Observable Markov Decision Processes, 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[94] Geoffrey J. Gordon, et al. Finding Approximate POMDP Solutions Through Belief Compression, 2005, J. Artif. Intell. Res.
[95] George E. Monahan. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982, Manag. Sci.
[96] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[97] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[98] Long Lin, et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains, 1992.
[99] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[100] Benjamin Van Roy, et al. A Tractable POMDP for a Class of Sequencing Problems, 2001, UAI.
[101] Reid G. Simmons, et al. Point-Based POMDP Algorithms: Improved Analysis and Implementation, 2005, UAI.
[102] Nikos A. Vlassis, et al. Planning with Continuous Actions in Partially Observable Environments, 2005, ICRA.
[103] Bram Bakker. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[104] Michael L. Littman, et al. Algorithms for Sequential Decision Making, 1996.
[105] Andrew McCallum, et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.
[106] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[107] Satinder P. Singh, et al. Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes, 1998, NIPS.
[108] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[109] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[110] J. Satia, et al. Markovian Decision Processes with Probabilistic Observation of States, 1973.
[111] Guy Shani, et al. Forward Search Value Iteration for POMDPs, 2007, IJCAI.
[112] Craig Boutilier, et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations, 1996, AAAI/IAAI.
[113] Leslie Pack Kaelbling, et al. Acting under uncertainty: discrete Bayesian models for mobile-robot navigation, 1996, IROS.
[114] Ronen I. Brafman. A Heuristic Variable Grid Solution Method for POMDPs, 1997, AAAI/IAAI.
[115] Leslie Pack Kaelbling, et al. Continuous-State POMDPs with Hybrid Dynamics, 2008, ISAIM.
[116] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[117] Richard Dearden, et al. Planning to see: A hierarchical approach to planning visual actions on a robot using POMDPs, 2010, Artif. Intell.
[118] Reid G. Simmons, et al. Unsupervised learning of probabilistic models for robot navigation, 1996, ICRA.
[119] Pascal Poupart, et al. Model-based Bayesian Reinforcement Learning in Partially Observable Domains, 2008, ISAIM.
[120] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning, 2006, ICML.
[121] Edward J. Sondik. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon, 1978.
[122] Guy Shani, et al. Model-Based Online Learning of POMDPs, 2005, ECML.
[123] R. L. Stratonovich. Conditional Markov Processes, 1960.
[124] Shlomo Zilberstein, et al. Region-Based Incremental Pruning for POMDPs, 2004, UAI.
[125] Reid G. Simmons, et al. Probabilistic Robot Navigation in Partially Observable Environments, 1995, IJCAI.
[126] Kin Man Poon. A fast heuristic algorithm for decision-theoretic planning, 2001.
[127] A. Cassandra. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[128] Sebastian Thrun, et al. Coastal Navigation with Mobile Robots, 1999, NIPS.
[129] Sridhar Mahadevan, et al. Approximate planning with hierarchical partially observable Markov decision process models for robot navigation, 2002, ICRA.
[130] Guy Shani, et al. Efficient ADD Operations for Point-Based Algorithms, 2008, ICAPS.
[131] Leslie Pack Kaelbling, et al. Grasping POMDPs, 2007, ICRA.