Between MDPs and Semi-MDPs: Learning, Planning & Representing Knowledge at Multiple Temporal Scales
[1] John R. Koza, et al. Automatic Programming of Robots Using Genetic Programming, 1992, AAAI.
[2] Leslie Pack Kaelbling, et al. Planning under Time Constraints in Stochastic Domains, 1993, Artif. Intell.
[3] R. Korf. Learning to solve problems by searching for macro-operators, 1983.
[4] Satinder Singh. Transfer of Learning by Composing Solutions of Elemental Sequential Tasks, 1992, Mach. Learn.
[5] Mark B. Ring. Incremental Development of Complex Behaviors, 1991, ML.
[6] Reid G. Simmons, et al. Probabilistic Robot Navigation in Partially Observable Environments, 1995, IJCAI.
[7] Roderic A. Grupen, et al. A feedback control structure for on-line learning tasks, 1997, Robotics Auton. Syst.
[8] Pattie Maes, et al. A bottom-up mechanism for behavior selection in an artificial creature, 1991.
[9] Marco Colombetti, et al. Robot Shaping: Developing Autonomous Agents Through Learning, 1994, Artif. Intell.
[10] Richard S. Sutton, et al. TD Models: Modeling the World at a Mixture of Time Scales, 1995, ICML.
[11] John N. Tsitsiklis, et al. Reinforcement Learning for Call Admission Control and Routing in Integrated Service Networks, 1997, NIPS.
[12] Maja J. Mataric, et al. Behaviour-based control: examples from navigation, learning, and group behaviour, 1997, J. Exp. Theor. Artif. Intell.
[13] Selahattin Kuru, et al. Qualitative System Identification: Deriving Structure from Behavior, 1996, Artif. Intell.
[14] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[15] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[16] Benjamin Kuipers, et al. Common-Sense Knowledge of Space: Learning from Experience, 1979, IJCAI.
[17] Paul R. Cohen, et al. Concepts From Time Series, 1998, AAAI/IAAI.
[18] Allen Newell, et al. Human Problem Solving, 1973.
[19] Chris Drummond, et al. Composing Functions to Speed up Reinforcement Learning in a Changing World, 1998, ECML.
[20] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.
[21] Lambert E. Wixson, et al. Scaling Reinforcement Learning Techniques via Modularity, 1991, ML.
[22] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[23] Marco Colombetti, et al. Behavior analysis and training: a methodology for behavior engineering, 1996, IEEE Trans. Syst. Man Cybern. Part B.
[24] Satinder P. Singh, et al. Reinforcement Learning with a Hierarchy of Abstract Models, 1992, AAAI.
[25] Michael I. Jordan, et al. Massachusetts Institute of Technology Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, 1996.
[26] Doina Precup, et al. Multi-time Models for Temporally Abstract Planning, 1997, NIPS.
[27] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[28] Johan de Kleer, et al. A Qualitative Physics Based on Confluences, 1984, Artif. Intell.
[29] Rodney A. Brooks, et al. A Robust Layered Control System for a Mobile Robot, 1986.
[30] Roger W. Brockett, et al. Hybrid Models for Motion Control Systems, 1993.
[31] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[32] Robert L. Grossman, et al. Timed Automata, 1999, CAV.
[33] Thomas Dean, et al. Decomposition Techniques for Planning in Stochastic Domains, 1995, IJCAI.
[34] Russell Greiner, et al. A Statistical Approach to Solving the EBL Utility Problem, 1992, AAAI.
[35] Doina Precup, et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options, 1998, ECML.
[36] R. Sutton, et al. Macro-Actions in Reinforcement Learning: An Empirical Analysis, 1998.
[37] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[38] Gerald DeJong, et al. Learning to Plan in Continuous Domains, 1994, Artif. Intell.
[39] Ronen I. Brafman, et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning, 1997, IJCAI.
[40] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[41] Richard Fikes, et al. Learning and Executing Generalized Robot Plans, 1993, Artif. Intell.
[42] Richard S. Sutton, et al. Roles of Macro-Actions in Accelerating Reinforcement Learning, 1998.
[43] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[44] Nils J. Nilsson, et al. Teleo-Reactive Programs for Agent Control, 1993, J. Artif. Intell. Res.
[45] Gary L. Drescher, et al. Made-up minds: a constructivist approach to artificial intelligence, 1991.
[46] Eric A. Hansen, et al. Cost-Effective Sensing during Plan Execution, 1994, AAAI.
[47] Earl D. Sacerdoti, et al. Planning in a Hierarchy of Abstraction Spaces, 1974, IJCAI.
[48] Milos Hauskrecht, et al. Hierarchical Solution of Markov Decision Processes using Macro-actions, 1998, UAI.
[49] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[50] S. Haykin, et al. A Q-learning-based dynamic channel assignment technique for mobile communication systems, 1999.
[51] Minoru Asada, et al. Behavior coordination for a mobile robot using modular reinforcement learning, 1996, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '96).
[52] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[53] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[54] Dimitri P. Bertsekas, et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems, 1996, NIPS.
[55] Satinder P. Singh, et al. The Efficient Learning of Multiple Task Sequences, 1991, NIPS.
[56] Nils J. Nilsson, et al. A Hierarchical Robot Planning and Execution System, 1973.
[57] L. Chrisman. Reasoning About Probabilistic Actions At Multiple Levels of Granularity, 1994.
[58] David Ruby, et al. Learning Episodes for Optimization, 1992, ML.
[59] Ronen I. Brafman, et al. Modeling Agents as Qualitative Decision Makers, 1997, Artif. Intell.
[60] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[61] Jonas Karlsson, et al. Learning via task decomposition, 1993.
[62] Rodney A. Brooks, et al. Learning to Coordinate Behaviors, 1990, AAAI.
[63] Karen Zita Haigh, et al. Exploiting domain geometry in analogical route planning, 1997, J. Exp. Theor. Artif. Intell.
[64] Roderic A. Grupen, et al. Robust Reinforcement Learning in Motion Planning, 1993, NIPS.
[65] Richard E. Korf, et al. Planning as Search: A Quantitative Approach, 1987, Artif. Intell.
[66] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.