Tractable planning under uncertainty: exploiting structure
[1] Thomas G. Dietterich,et al. A POMDP Approximation Algorithm That Anticipates the Need to Observe , 2000, PRICAI.
[2] Bart Selman,et al. Planning as Satisfiability , 1992, ECAI.
[3] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[4] Sebastian Thrun,et al. Locating moving entities in indoor environments with teams of mobile robots , 2003, AAMAS '03.
[5] Earl D. Sacerdoti,et al. Planning in a Hierarchy of Abstraction Spaces , 1974, IJCAI.
[6] Craig Boutilier,et al. Value-Directed Compression of POMDPs , 2002, NIPS.
[7] Amedeo Cesta,et al. Recent Advances in AI Planning , 1997, Lecture Notes in Computer Science.
[8] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[9] Nicholas Roy,et al. Exponential Family PCA for Belief Compression in POMDPs , 2002, NIPS.
[10] Leslie Pack Kaelbling,et al. Learning Policies with External Memory , 1999, ICML.
[11] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..
[12] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.
[13] Andrew G. Barto,et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning , 2002, ICML.
[14] Sebastian Thrun,et al. Monte Carlo POMDPs , 1999, NIPS.
[15] David P. Miller,et al. Experiences with an architecture for intelligent, reactive agents , 1995, J. Exp. Theor. Artif. Intell..
[16] Blai Bonet,et al. Planning as heuristic search , 2001, Artif. Intell..
[17] Eric A. Hansen,et al. An Improved Grid-Based Approximation Algorithm for POMDPs , 2001, IJCAI.
[18] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[19] Brian Austin Tate. Using goal structure to direct search in a problem solver , 1975 .
[20] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[21] Illah R. Nourbakhsh,et al. DERVISH - An Office-Navigating Robot , 1995, AI Mag..
[22] Marc G. Slack,et al. Integrating deliberative planning in a robot architecture , 1994 .
[23] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[24] Mosur Ravishankar. Efficient Algorithms for Speech Recognition , 1996 .
[25] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.
[26] Malcolm R. K. Ryan. Using Abstract Models of Behaviours to Automatically Generate Reinforcement Learning Hierarchies , 2002, ICML.
[27] N. Zhang,et al. Algorithms for partially observable Markov decision processes , 2001 .
[28] R. Simmons,et al. Probabilistic Navigation in Partially Observable Environments , 1995 .
[29] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[30] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[31] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[32] Erann Gat,et al. Integrating Planning and Reacting in a Heterogeneous Asynchronous Architecture for Controlling Real-World Mobile Robots , 1992, AAAI.
[33] Piergiorgio Bertoli,et al. Heuristic Search + Symbolic Model Checking = Efficient Conformant Planning , 2001, IJCAI.
[34] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[35] Craig Boutilier,et al. A POMDP formulation of preference elicitation problems , 2002, AAAI/IAAI.
[36] Satinder P. Singh,et al. How to Dynamically Merge Markov Decision Processes , 1997, NIPS.
[37] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[38] Andrew McCallum,et al. Overcoming Incomplete Perception with Utile Distinction Memory , 1993, ICML.
[39] Thomas Dean,et al. Decomposition Techniques for Planning in Stochastic Domains , 1995, IJCAI.
[40] Reid G. Simmons,et al. Probabilistic Robot Navigation in Partially Observable Environments , 1995, IJCAI.
[41] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[42] David Andre,et al. State abstraction for programmable reinforcement learning agents , 2002, AAAI/IAAI.
[43] Ronen I. Brafman,et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning , 1997, IJCAI.
[44] Craig Boutilier,et al. Value-Directed Belief State Approximation for POMDPs , 2000, UAI.
[45] Weihong Zhang,et al. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes , 2001, J. Artif. Intell. Res..
[46] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[47] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[48] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[49] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[50] Anthony Barrett,et al. Task-Decomposition via Plan Parsing , 1994, AAAI.
[51] Kenneth M. Dawson-Howe,et al. The application of robotics to a mobility aid for the elderly blind , 1998, Robotics Auton. Syst..
[52] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.
[53] Joelle Pineau,et al. Spoken Dialogue Management Using Probabilistic Reasoning , 2000, ACL.
[54] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[55] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[56] Sridhar Mahadevan,et al. Learning Hierarchical Partially Observable Markov Decision Process Models for Robot Navigation , 2001 .
[57] Sam Steel,et al. Integrating Planning, Execution and Monitoring , 1988, AAAI.
[58] Hector J. Levesque,et al. GOLOG: A Logic Programming Language for Dynamic Domains , 1997, J. Log. Program..
[59] Jon Louis Bentley,et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.
[60] David E. Smith,et al. Conformant Graphplan , 1998, AAAI/IAAI.
[61] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[62] Maria Gini,et al. Deferred Planning and Sensor Use , 1990 .
[63] David Chapman,et al. Planning for Conjunctive Goals , 1987, Artif. Intell..
[64] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[65] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[66] Xavier Boyen,et al. Tractable Inference for Complex Stochastic Processes , 1998, UAI.
[67] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[68] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[69] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[70] Robert P. Goldman,et al. Expressive Planning and Explicit Knowledge , 1996, AIPS.
[71] Sebastian Thrun,et al. Coastal Navigation with Mobile Robots , 1999, NIPS.
[72] Joelle Pineau,et al. Experiences with a mobile robotic guide for the elderly , 2002, AAAI/IAAI.
[73] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[74] Martha E. Pollack,et al. A Plan-Based Personalized Cognitive Orthotic , 2002, AIPS.
[75] Sridhar Mahadevan,et al. Hierarchical Memory-Based Reinforcement Learning , 2000, NIPS.
[76] Yoram Singer,et al. The Hierarchical Hidden Markov Model: Analysis and Applications , 1998, Machine Learning.
[77] David A. McAllester,et al. Systematic Nonlinear Planning , 1991, AAAI.
[78] Geoffrey J. Gordon,et al. Finding Approximate POMDP Solutions Through Belief Compression , 2005, J. Artif. Intell. Res..
[79] Gregg Collins,et al. Planning for Contingencies: A Decision-based Approach , 1996, J. Artif. Intell. Res..
[80] Jeffrey K. Uhlmann,et al. Satisfying General Proximity/Similarity Queries with Metric Trees , 1991, Inf. Process. Lett..
[81] Sebastian Thrun,et al. Motion planning through policy search , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[82] Rodney A. Brooks. A Robust Layered Control System For A Mobile Robot , 1986, IEEE Journal of Robotics and Automation.
[83] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[84] Yishay Mansour,et al. Approximate Planning in Large POMDPs via Reusable Trajectories , 1999, NIPS.
[85] N. Vlassis,et al. A fast point-based algorithm for POMDPs , 2004 .
[86] Joelle Pineau,et al. Towards robotic assistants in nursing homes: Challenges and results , 2003, Robotics Auton. Syst..
[88] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[89] Edward J. Sondik. The optimal control of partially observable Markov processes , 1971 .
[90] Andrew Y. Ng,et al. Policy Search via Density Estimation , 1999, NIPS.
[91] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[92] Avrim Blum,et al. Fast Planning Through Planning Graph Analysis , 1995, IJCAI.
[93] Kevin M. Lynch,et al. Sensorless parts orienting with a one-joint manipulator , 1997, Proceedings of International Conference on Robotics and Automation.
[94] Kin Man Poon,et al. A fast heuristic algorithm for decision-theoretic planning , 2001 .
[95] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 1992, Machine Learning.
[96] M. Rosencrantz,et al. Locating Moving Entities in Dynamic Indoor Environments with Teams of Mobile Robots , 2002 .
[97] Milos Hauskrecht,et al. Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes , 1997, AAAI/IAAI.
[98] Daniel S. Weld,et al. UCPOP: A Sound, Complete, Partial Order Planner for ADL , 1992, KR.
[99] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[100] David H. D. Warren,et al. Generating Conditional Plans and Programs , 1976, AISB.
[101] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[102] Chelsea C. White,et al. A survey of solution techniques for the partially observed Markov decision process , 1991, Ann. Oper. Res..
[103] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[104] Blai Bonet,et al. An epsilon-Optimal Grid-Based Algorithm for Partially Observable Markov Decision Processes , 2002, ICML.
[105] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[106] Jim Blythe,et al. Planning Under Uncertainty in Dynamic Domains , 1998 .
[107] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[108] Kee-Eung Kim,et al. Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.
[109] Wolfram Burgard,et al. Experiences with an Interactive Museum Tour-Guide Robot , 1999, Artif. Intell..
[110] Andrew W. Moore,et al. Very Fast EM-Based Mixture Model Clustering Using Multiresolution Kd-Trees , 1998, NIPS.
[111] Joelle Pineau,et al. Pearl: A Mobile Robotic Assistant for the Elderly , 2002 .
[112] Robert P. Goldman,et al. Conditional Linear Planning , 1994, AIPS.
[113] Craig Boutilier,et al. Bounded Finite State Controllers , 2003, NIPS.
[114] Wenju Liu,et al. Planning in Stochastic Domains: Problem Characteristics and Approximation , 1996 .
[115] Gang Wang,et al. Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes , 1999, ICML.
[116] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[117] Mark A. Peot,et al. Conditional nonlinear planning , 1992 .
[118] David Madigan,et al. Probabilistic Temporal Reasoning , 2005, Handbook of Temporal Reasoning in Artificial Intelligence.
[119] Ronen I. Brafman,et al. A Heuristic Variable Grid Solution Method for POMDPs , 1997, AAAI/IAAI.
[120] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[121] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .
[122] Sebastian Thrun,et al. Learning low dimensional predictive representations , 2004, ICML.
[123] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[124] Andrew G. Barto,et al. Automated State Abstraction for Options using the U-Tree Algorithm , 2000, NIPS.
[125] D. Castañón. Approximate dynamic programming for sensor management , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[126] Daniel S. Weld,et al. A Probablistic Model of Action for Least-Commitment Planning with Information Gathering , 1994, UAI.
[127] Peter Norvig,et al. Artificial intelligence - a modern approach, 2nd Edition , 2003, Prentice Hall series in artificial intelligence.
[128] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems , 1960, Journal of Basic Engineering.
[129] Jonathan H. Connell,et al. SSS: a hybrid architecture applied to robot navigation , 1992, Proceedings 1992 IEEE International Conference on Robotics and Automation.
[130] Wolfram Burgard,et al. Monte Carlo Localization: Efficient Position Estimation for Mobile Robots , 1999, AAAI/IAAI.
[131] Paul Taylor,et al. Festival Speech Synthesis System , 1998 .
[133] Ronald C. Arkin. Behavior-Based Robotics , 1998 .
[134] A. Jazwinski. Stochastic Processes and Filtering Theory , 1970 .
[135] Peter Stone,et al. Learning Predictive State Representations , 2003, ICML.
[136] Michael J. Swain,et al. Programming CHIP for the IJCAI-95 Robot Competition , 1996, AI Mag..
[137] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[138] Nicholas Kushmerick,et al. An Algorithm for Probabilistic Planning , 1995, Artif. Intell..