Efficient Exploration in Reinforcement Learning
[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[3] Charles W. Anderson,et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning) , 1986 .
[4] Ronald L. Rivest,et al. Diversity-based inference of finite automata , 1987, 28th Annual Symposium on Foundations of Computer Science (SFCS 1987).
[5] B. Widrow,et al. The truck backer-upper: an example of self-learning in neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[6] A. Barto,et al. Learning and Sequential Decision Making , 1989 .
[7] Michael I. Jordan,et al. Learning to Control an Unstable System with Forward Modeling , 1989, NIPS.
[8] Michael C. Mozer,et al. Discovering the Structure of a Reactive Environment by Exploration , 1990, Neural Computation.
[9] Sebastian Thrun,et al. Planning with an Adaptive World Model , 1990, NIPS.
[10] Bartlett W. Mel,et al. Murphy: A neurally-inspired connectionist approach to learning and performance in vision-based robot motion planning , 1990 .
[11] Donald A. Sofge,et al. Neural network based process optimization and control , 1990, 29th IEEE Conference on Decision and Control.
[12] Andrew W. Moore,et al. Efficient memory-based learning for robot control , 1990 .
[13] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[14] Andrew G. Barto,et al. On the Computational Economics of Reinforcement Learning , 1991 .
[15] J. Urgen Schmidhuber,et al. Adaptive confidence and adaptive curiosity , 1991, Forschungsberichte, TU Munich.
[16] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[17] Sridhar Mahadevan,et al. Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture , 1991, ML.
[18] Steven D. Whitehead,et al. Complexity and Cooperation in Q-Learning , 1991, ML.
[19] Sebastian Thrun,et al. On Planning And Exploration In Non-Discrete Environments , 1991 .
[20] A. W. Moore. An Introductory Tutorial on Kd-trees, extract from Andrew Moore's PhD thesis: Efficient Memory-based Learning for Robot Control , 1991 .
[21] Sebastian Thrun,et al. Active Exploration in Dynamic Environments , 1991, NIPS.
[22] Andrew W. Moore,et al. Knowledge of knowledge and intelligent experimentation for learning control , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[23] Long-Ji Lin,et al. Self-improving reactive agents: case studies of reinforcement learning frameworks , 1991 .
[24] Long Ji Lin,et al. Self-improvement Based on Reinforcement Learning, Planning and Teaching , 1991, ML.
[25] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[26] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .