Self-improving reactive agents based on reinforcement learning, planning and teaching
[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] Tom M. Mitchell,et al. Generalization as Search , 1982 .
[3] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[4] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[5] D. Rumelhart. Learning Internal Representations by Error Propagation, Parallel Distributed Processing , 1986 .
[6] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[7] Charles W. Anderson,et al. Strategy Learning with Multilayer Connectionist Representations , 1987 .
[8] S. Grossberg,et al. RAMBOT: A Connectionist Expert System That Learns by Example , 1987 .
[9] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1989 .
[10] D. Ballard,et al. A Role for Anticipation in Reactive Systems that Learn , 1989, ML.
[11] A. Barto,et al. Learning and Sequential Decision Making , 1989 .
[12] Ronald J. Williams,et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.
[13] Kevin J. Lang. A time delay neural network architecture for speech recognition , 1989 .
[14] Geoffrey E. Hinton,et al. Distributed Representations , 1986, The Philosophy of Artificial Intelligence.
[15] Sebastian Thrun,et al. Planning with an Adaptive World Model , 1990, NIPS.
[16] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[17] Ming Tan,et al. Learning a Cost-Sensitive Internal Representation for Reinforcement Learning , 1991, ML.
[18] A. Moore. Variable Resolution Dynamic Programming , 1991, ML.
[19] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[20] Sridhar Mahadevan,et al. Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture , 1991, ML.
[21] Steven D. Whitehead,et al. Complexity and Cooperation in Q-Learning , 1991, ML.
[22] Long-Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[23] Sebastian Thrun,et al. Active Exploration in Dynamic Environments , 1991, NIPS.
[24] Long-Ji Lin,et al. Self-improving reactive agents: case studies of reinforcement learning frameworks , 1991 .
[25] Long-Ji Lin,et al. Self-improvement Based on Reinforcement Learning, Planning and Teaching , 1991, ML.
[26] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[27] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[28] Peter Dayan,et al. The convergence of TD(λ) for general λ , 1992, Machine Learning.
[29] John J. Grefenstette,et al. Learning sequential decision rules using simulation models and competition , 1990, Machine Learning.
[30] Dana H. Ballard,et al. Learning to perceive and act by trial and error , 1991, Machine Learning.
[31] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.