An Introduction to Reinforcement Learning
[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] R. Bellman. Dynamic programming , 1957, Science.
[3] G. Siouris,et al. Optimum systems control , 1979, Proceedings of the IEEE.
[4] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .
[5] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[6] Charles W. Anderson,et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning) , 1986 .
[7] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[8] W. Cleveland,et al. Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting , 1988 .
[9] Kumpati S. Narendra,et al. Learning automata - an introduction , 1989 .
[10] David H. Ackley,et al. Generalization and Scaling in Reinforcement Learning , 1989, NIPS.
[11] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[12] Rodney A. Brooks,et al. Learning to Coordinate Behaviors , 1990, AAAI.
[13] Vijaykumar Gullapalli,et al. A stochastic reinforcement learning algorithm for learning real-valued functions , 1990, Neural Networks.
[14] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[15] A. Moore. Variable Resolution Dynamic Programming , 1991, ML.
[16] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[17] Sridhar Mahadevan,et al. Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture , 1991, ML.
[18] Richard S. Sutton,et al. Reinforcement learning architectures for animats , 1991 .
[19] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[20] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 1992, Machine Learning.
[21] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[22] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[23] Gerald Tesauro,et al. Practical Issues in Temporal Difference Learning , 1992, Mach. Learn.
[24] Sebastian Thrun,et al. The role of exploration in learning control , 1992 .
[25] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell.
[26] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[27] Vijaykumar Gullapalli,et al. Reinforcement learning and its application to control , 1992 .
[28] Long Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .
[29] L.-J. Lin,et al. Hierarchical learning of robot skills by reinforcement , 1993, IEEE International Conference on Neural Networks.
[30] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[31] Leslie Pack Kaelbling,et al. Planning With Deadlines in Stochastic Domains , 1993, AAAI.
[32] Jing Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, Adapt. Behav.
[33] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[34] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[35] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[36] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[37] S. Schaal,et al. Robot juggling: implementation of memory-based learning , 1994, IEEE Control Systems.
[38] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine-mediated learning.