An Integrated Approach to Hierarchy and Abstraction for POMDPs
[1] Tucker R. Balch, et al. Symmetry in Markov Decision Processes and Its Implications for Single Agent and Multiagent Learning, 2001, ICML.
[2] Sebastian Thrun, et al. Coastal Navigation with Mobile Robots, 1999, NIPS.
[3] Liqiang Feng, et al. Navigating Mobile Robots: Systems and Techniques, 1996.
[4] David Andre, et al. State Abstraction for Programmable Reinforcement Learning Agents, 2002, AAAI/IAAI.
[5] Ronen I. Brafman, et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning, 1997, IJCAI.
[6] Craig Boutilier, et al. Value-Directed Belief State Approximation for POMDPs, 2000, UAI.
[7] Robert Givan, et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes, 1997, UAI.
[8] Mark C. Torrance, et al. Natural Communication with Robots, 1994.
[9] Xavier Boyen, et al. Tractable Inference for Complex Stochastic Processes, 1998, UAI.
[10] Ronald C. Arkin. Behavior-Based Robotics, 1998.
[11] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[12] Judea Pearl, et al. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, 1991, Morgan Kaufmann Series in Representation and Reasoning.
[13] Joelle Pineau, et al. Spoken Dialog Management for Robots, 2000, ACL.
[14] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[15] Lawrence R. Rabiner, et al. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, 1989, Proc. IEEE.
[16] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[17] Ingemar J. Cox, et al. Autonomous Robot Vehicles, 1990, Springer.
[18] Gang Wang, et al. Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes, 1999, ICML.
[19] W. Lovejoy. A Survey of Algorithmic Methods for Partially Observed Markov Decision Processes, 1991.
[20] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[21] A. Jazwinski. Stochastic Processes and Filtering Theory, 1970.
[22] Satinder Singh. Transfer of Learning by Composing Solutions of Elemental Sequential Tasks, 1992, Machine Learning.
[23] Sridhar Mahadevan, et al. Learning Hierarchical Partially Observable Markov Decision Process Models for Robot Navigation, 2001.
[24] Thomas Dean, et al. Decomposition Techniques for Planning in Stochastic Domains, 1995, IJCAI.
[25] Ronen I. Brafman, et al. A Heuristic Variable Grid Solution Method for POMDPs, 1997, AAAI/IAAI.
[26] Joelle Pineau, et al. Experiences with a Mobile Robotic Guide for the Elderly, 2002, AAAI/IAAI.
[27] Martha E. Pollack, et al. A Plan-Based Personalized Cognitive Orthotic, 2002, AIPS.
[28] Sridhar Mahadevan, et al. Hierarchical Memory-Based Reinforcement Learning, 2000, NIPS.
[29] Erann Gat, et al. ESL: A Language for Supporting Robust Plan Execution in Embedded Autonomous Agents, 1997, IEEE Aerospace Conference.
[30] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[31] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[32] Thomas G. Dietterich. An Overview of MAXQ Hierarchical Reinforcement Learning, 2000, SARA.
[33] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems, 1960, Journal of Basic Engineering.
[34] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[35] Richard Fikes, et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving, 1971, IJCAI.
[36] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning Using Diverse Density, 2001, ICML.
[37] Nicholas Kushmerick, et al. An Algorithm for Probabilistic Planning, 1995, Artif. Intell.
[38] Mosur Ravishankar. Efficient Algorithms for Speech Recognition, 1996.
[39] Malcolm R. K. Ryan. Using Abstract Models of Behaviours to Automatically Generate Reinforcement Learning Hierarchies, 2002, ICML.
[40] Craig Boutilier, et al. Stochastic Dynamic Programming with Factored Representations, 2000, Artif. Intell.
[41] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, CACM.
[42] David Andre, et al. Programmable Reinforcement Learning Agents, 2000, NIPS.
[43] Thomas G. Dietterich, et al. A POMDP Approximation Algorithm That Anticipates the Need to Observe, 2000, PRICAI.
[44] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[45] Milos Hauskrecht. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[46] Robert Givan, et al. Model Minimization in Markov Decision Processes, 1997, AAAI/IAAI.
[47] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.
[48] J. Ross Quinlan. Induction of Decision Trees, 1986, Machine Learning.
[49] Chelsea C. White. A Survey of Solution Techniques for the Partially Observed Markov Decision Process, 1991, Ann. Oper. Res.
[50] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[51] Stuart J. Russell, et al. Approximating Optimal Policies for Partially Observable Stochastic Domains, 1995, IJCAI.
[52] Kee-Eung Kim, et al. Solving Very Large Weakly Coupled Markov Decision Processes, 1998, AAAI/IAAI.
[53] Wolfram Burgard, et al. Experiences with an Interactive Museum Tour-Guide Robot, 1999, Artif. Intell.
[54] Joelle Pineau, et al. Pearl: A Mobile Robotic Assistant for the Elderly, 2002.
[55] Sebastian Thrun, et al. Monte Carlo POMDPs, 1999, NIPS.
[56] Blai Bonet, et al. Planning as Heuristic Search, 2001, Artif. Intell.
[57] Joelle Pineau, et al. High-Level Robot Behavior Control Using POMDPs, 2002.
[58] Paul Taylor, et al. Festival Speech Synthesis System, 1998.
[59] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[60] Andrew W. Moore, et al. The Parti-Game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-Spaces, 1995, Machine Learning.
[61] Satinder P. Singh, et al. How to Dynamically Merge Markov Decision Processes, 1997, NIPS.
[62] Andrew McCallum. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.
[63] Rodney A. Brooks. A Robust Layered Control System for a Mobile Robot, 1986, IEEE Journal of Robotics and Automation.