Perseus: Randomized Point-based Value Iteration for POMDPs
[1] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information, 1965.
[2] E. Dynkin. Controlled Random Sequences, 1965.
[3] M. Aoki. Optimal control of partially observable Markovian systems, 1965.
[4] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[5] J. Satia, et al. Markovian Decision Processes with Probabilistic Observation of States, 1973.
[6] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[7] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[8] John N. Tsitsiklis, et al. Parallel and Distributed Computation, 1989.
[9] William S. Lovejoy, et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes, 1991, Oper. Res.
[10] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[11] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[12] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[13] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Athena Scientific.
[14] Csaba Szepesvári, et al. Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms, 1996.
[15] David Andre, et al. Generalized Prioritized Sweeping, 1997, NIPS.
[16] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[17] Ronen I. Brafman, et al. A Heuristic Variable Grid Solution Method for POMDPs, 1997, AAAI/IAAI.
[18] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[19] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[20] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[21] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[22] Anne Condon, et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems, 1999, AAAI/IAAI.
[23] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[24] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[25] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.
[26] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[27] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[28] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[29] Weihong Zhang, et al. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes, 2001, J. Artif. Intell. Res.
[30] Eric A. Hansen, et al. An Improved Grid-Based Approximation Algorithm for POMDPs, 2001, IJCAI.
[31] N. Zhang, et al. Algorithms for partially observable Markov decision processes, 2001.
[32] Ronald E. Parr, et al. Solving Factored POMDPs with Linear Value Functions, 2001.
[33] Kin Man Poon, et al. A fast heuristic algorithm for decision-theoretic planning, 2001.
[34] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[35] Blai Bonet, et al. An epsilon-Optimal Grid-Based Algorithm for Partially Observable Markov Decision Processes, 2002, ICML.
[36] Nicholas Roy, et al. Exponential Family PCA for Belief Compression in POMDPs, 2002, NIPS.
[37] Zaharia, et al. The Interface Specification and Implementation Internals of a Program Module for Geometric Algebra, 2002.
[38] Jonathan Baxter, et al. Scaling Internal-State Policy-Gradient Methods for POMDPs, 2002.
[39] Marius Dorian Zaharia. Computer Graphics from a Geometric Algebra Perspective, 2002.
[40] Craig Boutilier, et al. Value-Directed Compression of POMDPs, 2002, NIPS.
[41] Joelle Pineau, et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.
[42] F. Groen, et al. Fast Translation Invariant Classification of (HRR) Range Profiles in a Zero Phase Representation, 2003.
[43] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[44] Ben Kröse, et al. Aircraft Classification from Estimated Models of Radar Scattering, 2003.
[45] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[46] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 2nd Edition, 2003, Prentice Hall Series in Artificial Intelligence.
[47] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.
[48] Jelle R. Kok, et al. The Pursuit Domain Package, 2003.
[49] Craig Boutilier, et al. VDCBPI: an Approximate Scalable Algorithm for Large POMDPs, 2004, NIPS.
[50] N. Vlassis, et al. A fast point-based algorithm for POMDPs, 2004.
[51] Reid G. Simmons, et al. Heuristic Search Value Iteration for POMDPs, 2004, UAI.
[52] Leslie Pack Kaelbling, et al. Representing hierarchical POMDPs as DBNs for multi-scale robot localization, 2004, IEEE International Conference on Robotics and Automation (ICRA).
[53] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 1993, Machine Learning.
[54] Craig Boutilier, et al. Stochastic Local Search for POMDP Controllers, 2004, AAAI.
[55] Jeff G. Schneider, et al. Approximate solutions for partially observable stochastic games with common payoffs, 2004, AAMAS.
[56] Nikos A. Vlassis, et al. A point-based POMDP algorithm for robot planning, 2004, IEEE International Conference on Robotics and Automation (ICRA).
[57] Claudia V. Goldman, et al. Solving Transition Independent Decentralized Markov Decision Processes, 2004, J. Artif. Intell. Res.
[58] Zoubin Ghahramani, et al. Propagating uncertainty in POMDP value iteration with Gaussian processes, 2004.
[59] Amos Storkey, et al. Advances in Neural Information Processing Systems 20, 2007.
[60] Nikos A. Vlassis, et al. Robot Planning in Partially Observable Continuous Domains, 2005, BNAIC.
[61] Jesse Hoey, et al. A Decision-Theoretic Approach to Task Assistance for Persons with Dementia, 2005, IJCAI.
[62] Nikos A. Vlassis, et al. Planning with Continuous Actions in Partially Observable Environments, 2005, IEEE International Conference on Robotics and Automation (ICRA).
[63] Jesse Hoey, et al. Solving POMDPs with Continuous or Large Discrete Observation Spaces, 2005, IJCAI.
[64] Geoffrey J. Gordon, et al. Finding Approximate POMDP Solutions Through Belief Compression, 2005, J. Artif. Intell. Res.
[65] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[66] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes, 2005.
[67] Brahim Chaib-draa, et al. An online POMDP algorithm for complex multiagent environments, 2005, AAMAS.
[68] George E. Monahan, et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982, Management Science.