A Tutorial on Reinforcement Learning Techniques
[1] S. Haykin, et al. Neural Networks: A Comprehensive Foundation, 1994.
[2] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[3] Claude Sammut, et al. Controlling a steel mill with BOXES, 1996, Machine Intelligence 14.
[4] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[5] Ian H. Witten, et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments, 1977, Inf. Control.
[6] R. Andrew McCallum, et al. Hidden state and reinforcement learning with instance-based state identification, 1996, IEEE Trans. Syst. Man Cybern. Part B.
[7] P. B. Coaker, et al. Applied Dynamic Programming, 1964.
[8] Hilbert J. Kappen, et al. Learning Active Vision, 1998.
[9] Michael I. Jordan, et al. Massachusetts Institute of Technology Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, 1996.
[10] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[11] Leslie Pack Kaelbling, et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, 1991, IJCAI.
[12] José del R. Millán, et al. Rapid, safe, and incremental learning of navigation strategies, 1996, IEEE Trans. Syst. Man Cybern. Part B.
[13] Maja J. Matarić. A Comparative Analysis of Reinforcement Learning Methods, 1991.
[14] Marco Saerens, et al. A neural controller, 1989.
[15] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[16] Katsuhiko Ogata. Designing Linear Control Systems with MATLAB, 1993.
[17] Csaba Szepesvári. Static and Dynamic Aspects of Optimal Sequential Decision Making, 1998.
[18] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[19] Bernard Widrow, et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems, 1973, IEEE Trans. Syst. Man Cybern.
[20] John H. Holland, et al. Cognitive systems based on adaptive algorithms, 1977, SIGART Newsl.
[21] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.
[22] R. A. McCallum. First Results with Utile Distinction Memory for Reinforcement Learning, 1992.
[23] Rodney A. Brooks, et al. Real Robots, Real Learning Problems, 1993.
[24] Marco Colombetti, et al. Robot Shaping: Developing Autonomous Agents Through Learning, 1994, Artif. Intell.
[25] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[26] C. Striebel. Sufficient statistics in the optimum control of stochastic systems, 1965.
[27] A. Barto, et al. Learning and Sequential Decision Making, 1989.
[28] Andrew G. Barto, et al. Large-scale dynamic optimization using teams of reinforcement learning agents, 1996.
[29] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.
[30] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[31] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[32] Maja J. Matarić, et al. A Distributed Model for Mobile Robot Environment-Learning and Navigation, 1990.
[33] Dimitri P. Bertsekas, et al. A Counterexample to Temporal Differences Learning, 1995, Neural Computation.
[34] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[35] Richard S. Sutton, et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.
[36] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[37] W. T. Miller, et al. CMAC: an associative neural network alternative to backpropagation, 1990, Proc. IEEE.
[38] Dana H. Ballard, et al. Active Perception and Reinforcement Learning, 1990, Neural Computation.
[39] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[40] Satinder Singh, et al. Learning to Solve Markovian Decision Processes, 1993.
[41] M. Gabriel, et al. Learning and Computational Neuroscience: Foundations of Adaptive Networks, 1990.
[42] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.
[43] Long-Ji Lin, et al. Reinforcement Learning of Non-Markov Decision Processes, 1995, Artif. Intell.
[44] Erann Gat, et al. Behavior control for robotic exploration of planetary surfaces, 1994, IEEE Trans. Robotics Autom.
[45] Carlos H. C. Ribeiro, et al. Embedding a Priori Knowledge in Reinforcement Learning, 1998, J. Intell. Robotic Syst.
[46] Claude Sammut, et al. Recent progress with BOXES, 1994, Machine Intelligence 13.
[47] Long-Ji Lin, et al. Self-improving reactive agents: case studies of reinforcement learning frameworks, 1991.
[48] Rodney A. Brooks, et al. Elephants don't play chess, 1990, Robotics Auton. Syst.
[49] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[50] Michael I. Jordan, et al. Reinforcement Learning with Soft State Aggregation, 1994, NIPS.
[51] Dimitri P. Bertsekas, et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems, 1996, NIPS.
[52] Long-Ji Lin, et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains, 1992.
[53] Csaba Szepesvári, et al. Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms, 1996.