Reinforcement Learning in Artificial Intelligence

[1]  Andrew G. Barto,et al.  Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.

[2]  Wei Zhang,et al.  A Reinforcement Learning Approach to job-shop Scheduling , 1995, IJCAI.

[3]  Gerald Tesauro,et al.  Temporal difference learning and TD-Gammon , 1995, CACM.

[4]  Andrew G. Barto,et al.  Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..

[5]  Andrew G. Barto,et al.  Reinforcement learning control , 1994, Current Opinion in Neurobiology.

[6]  Jude W. Shavlik,et al.  Incorporating Advice into Agents that Learn from Reinforcements , 1994, AAAI.

[7]  Gerald Tesauro,et al.  TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.

[8]  Michael I. Jordan,et al.  Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.

[9]  Roderic A. Grupen,et al.  Robust Reinforcement Learning in Motion Planning , 1993, NIPS.

[10]  Leslie Pack Kaelbling,et al.  Learning in embedded systems , 1993 .

[11]  Geoffrey E. Hinton,et al.  Feudal Reinforcement Learning , 1992, NIPS.

[12]  Sebastian Thrun,et al.  Explanation-Based Neural Network Learning for Robot Control , 1992, NIPS.

[13]  Paul E. Utgoff,et al.  A Teaching Method for Reinforcement Learning , 1992, ML.

[14]  Satinder P. Singh,et al.  The Efficient Learning of Multiple Task Sequences , 1991, NIPS.

[15]  Sridhar Mahadevan,et al.  Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture , 1991, ML.

[16]  Satinder P. Singh,et al.  Transfer of Learning Across Compositions of Sequentail Tasks , 1991, ML.

[17]  Richard S. Sutton,et al.  Planning by Incremental Dynamic Programming , 1991, ML.

[18]  Rodney A. Brooks,et al.  Learning to Coordinate Behaviors , 1990, AAAI.

[19]  Dana H. Ballard,et al.  Active Perception and Reinforcement Learning , 1990, Neural Computation.

[20]  Richard S. Sutton,et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.

[21]  C. Watkins Learning from delayed rewards , 1989 .

[22]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[23]  Dimitri P. Bertsekas,et al.  Dynamic Programming: Deterministic and Stochastic Models , 1987 .

[24]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[25]  Judea Pearl,et al.  Heuristics : intelligent search strategies for computer problem solving , 1984 .

[26]  Richard S. Sutton,et al.  Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[27]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[28]  A. V. Napalkov,et al.  COMPUTERS AND THOUGHT, EDITED BY E. A. FEIGENBAUM AND J. FELDMAN, NEW YORK, MCGRAW-HILL, 1963: BOOK REVIEW, , 1967 .

[29]  Marvin Minsky,et al.  Steps toward Artificial Intelligence , 1995, Proceedings of the IRE.

[30]  Arthur L. Samuel,et al.  Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..