A Survey of Point-Based POMDP Solvers
[1] R. Bellman. A Markovian Decision Process, 1957.
[2] Ronald A. Howard. Dynamic Programming and Markov Processes. MIT Press, 1960.
[3] E. J. Sondik. The Optimal Control of Partially Observable Markov Decision Processes. PhD thesis, Stanford University, 1971.
[4] Edward J. Sondik. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs. Operations Research, 1978.
[5] William S. Lovejoy. Computationally Feasible Bounds for Partially Observed Markov Decision Processes. Operations Research, 1991.
[6] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, 1994.
[7] Leslie Pack Kaelbling et al. Learning Policies for Partially Observable Environments: Scaling Up. ICML, 1997.
[8] Andrew G. Barto et al. Learning to Act Using Real-Time Dynamic Programming. Artificial Intelligence, 1995.
[9] Richard S. Sutton et al. Reinforcement Learning with Replacing Eligibility Traces. Machine Learning, 1996.
[10] Michael L. Littman. Algorithms for Sequential Decision Making. PhD thesis, Brown University, 1996.
[11] Shlomo Zilberstein. Using Anytime Algorithms in Intelligent Systems. AI Magazine, 1996.
[12] Michael L. Littman et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes. UAI, 1997.
[13] Milos Hauskrecht. Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes. AAAI, 1997.
[14] Leslie Pack Kaelbling et al. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence, 1998.
[15] Hector Geffner et al. Solving Large POMDPs using Real Time Dynamic Programming, 1998.
[16] Eric A. Hansen. Solving POMDPs by Searching in Policy Space. UAI, 1998.
[17] Richard S. Sutton et al. Introduction to Reinforcement Learning. MIT Press, 1998.
[18] Andrew Y. Ng et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping. ICML, 1999.
[19] Milos Hauskrecht et al. Planning treatment of ischemic heart disease with partially observable Markov decision processes. Artificial Intelligence in Medicine, 2000.
[20] Milos Hauskrecht. Value-Function Approximations for Partially Observable Markov Decision Processes. Journal of Artificial Intelligence Research, 2000.
[21] Weihong Zhang et al. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes. Journal of Artificial Intelligence Research, 2001.
[22] Richard S. Sutton et al. Predictive Representations of State. NIPS, 2001.
[23] Kin Man Poon. A fast heuristic algorithm for decision-theoretic planning. MPhil thesis, The Hong Kong University of Science and Technology, 2001.
[24] Craig Boutilier. A POMDP formulation of preference elicitation problems. AAAI, 2002.
[25] Guy Shani et al. An MDP-Based Recommender System. Journal of Machine Learning Research, 2002.
[26] Joelle Pineau et al. Applying Metric-Trees to Belief-Point POMDPs. NIPS, 2003.
[27] Joelle Pineau et al. Point-based value iteration: An anytime algorithm for POMDPs. IJCAI, 2003.
[28] Blai Bonet et al. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming. ICAPS, 2003.
[29] Craig Boutilier et al. Bounded Finite State Controllers. NIPS, 2003.
[30] Reid G. Simmons et al. Heuristic Search Value Iteration for POMDPs. UAI, 2004.
[31] Michael L. Littman et al. An Instance-Based State Representation for Network Repair. AAAI, 2004.
[32] Nikos A. Vlassis et al. A point-based POMDP algorithm for robot planning. ICRA, 2004.
[33] Sean R. Eddy. What is dynamic programming? Nature Biotechnology, 2004.
[34] Nikos A. Vlassis et al. Perseus: Randomized Point-based Value Iteration for POMDPs. Journal of Artificial Intelligence Research, 2005.
[35] Doina Precup et al. Using core beliefs for point-based value iteration. IJCAI, 2005.
[36] Joelle Pineau et al. POMDP Planning for Robust Robot Control. ISRR, 2005.
[37] Richard S. Sutton et al. Reinforcement Learning: An Introduction. MIT Press, 1998.
[38] P. Poupart. Exploiting structure to efficiently solve large scale partially observable Markov decision processes. PhD thesis, University of Toronto, 2005.
[39] Reid G. Simmons et al. Point-Based POMDP Algorithms: Improved Analysis and Implementation. UAI, 2005.
[40] Kevin D. Seppi et al. Prioritization Methods for Accelerating MDP Solvers. Journal of Machine Learning Research, 2005.
[41] Doina Precup et al. Belief Selection in Point-Based Planning Algorithms for POMDPs. Canadian Conference on AI, 2006.
[42] Pascal Poupart et al. Point-Based Value Iteration for Continuous POMDPs. Journal of Machine Learning Research, 2006.
[43] Joelle Pineau et al. Anytime Point-Based Approximations for Large POMDPs. Journal of Artificial Intelligence Research, 2006.
[44] Eric A. Hansen et al. Indefinite-Horizon POMDPs with Action-Based Termination. AAAI, 2007.
[45] Timothy J. Ross. Review of "Complexity Management in Fuzzy Systems" by Alexander Gegov (Studies in Fuzziness and Soft Computing, Vol. 211). Artificial Intelligence, 2007.
[46] Peng Dai et al. Topological Value Iteration Algorithm for Markov Decision Processes. IJCAI, 2007.
[47] Guy Shani et al. Forward Search Value Iteration for POMDPs. IJCAI, 2007.
[48] Guy Shani et al. Scaling Up: Solving POMDPs through Value Based Clustering. AAAI, 2007.
[49] Brahim Chaib-draa et al. AEMS: An Anytime Online Search Algorithm for Approximate Policy Refinement in Large POMDPs. IJCAI, 2007.
[50] Steve J. Young et al. Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 2007.
[51] Leslie Pack Kaelbling et al. Grasping POMDPs. ICRA, 2007.
[52] Hui Li et al. Point-Based Policy Iteration. AAAI, 2007.
[53] Kee-Eung Kim et al. Symbolic Heuristic Search Value Iteration for Factored POMDPs. AAAI, 2008.
[54] Guy Shani et al. Efficient ADD Operations for Point-Based Algorithms. ICAPS, 2008.
[55] Guy Shani et al. Prioritizing Point-Based POMDP Solvers. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2008.
[56] David Hsu et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces. Robotics: Science and Systems, 2008.
[57] Nicholas Roy et al. The permutable POMDP: fast solutions to POMDPs for preference elicitation. AAMAS, 2008.
[58] N. Armstrong-Crews. Solving POMDPs from Both Sides: Growing Dual Parsimonious Bounds, 2008.
[59] Leslie Pack Kaelbling et al. Continuous-State POMDPs with Hybrid Dynamics. ISAIM, 2008.
[60] Guy Shani et al. Topological Order Planner for POMDPs. IJCAI, 2009.
[61] Guy Shani et al. Improving Existing Fault Recovery Policies. NIPS, 2009.
[62] Joelle Pineau et al. Development and Validation of a Robust Speech Interface for Improved Human-Robot Interaction. International Journal of Social Robotics, 2009.
[63] Blai Bonet et al. Solving POMDPs: RTDP-Bel vs. Point-based Algorithms. IJCAI, 2009.
[64] Hector Geffner et al. A Translation-Based Approach to Contingent Planning. IJCAI, 2009.
[65] Nicholas Roy et al. icLQG: Combining local and global optimization for control in information space. ICRA, 2009.
[66] Guy Shani. Evaluating Point-Based POMDP Solvers on Multicore Machines. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2010.
[67] Scott Sanner et al. Symbolic Dynamic Programming for First-order POMDPs. AAAI, 2010.
[68] Jesse Hoey et al. Automated handwashing assistance for persons with dementia using video and a partially observable Markov decision process. Computer Vision and Image Understanding, 2010.
[69] Roni Khardon et al. Relational Partially Observable MDPs. AAAI, 2010.
[70] Kee-Eung Kim et al. Closing the Gap: Improved Bounds on Optimal POMDP Solutions. ICAPS, 2011.
[71] Andreas Krause et al. Advances in Neural Information Processing Systems (NIPS), 2014.