Paper Information - Reinforcement Learning in Artificial Intelligence - 字舞流文

Reinforcement Learning in Artificial Intelligence

R. Sutton | A. Barto

[1] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.

[2] Wei Zhang, et al. A Reinforcement Learning Approach to Job-Shop Scheduling, 1995, IJCAI.

[3] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.

[4] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[5] Andrew G. Barto, et al. Reinforcement learning control, 1994, Current Opinion in Neurobiology.

[6] Jude W. Shavlik, et al. Incorporating Advice into Agents that Learn from Reinforcements, 1994, AAAI.

[7] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.

[8] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.

[9] Roderic A. Grupen, et al. Robust Reinforcement Learning in Motion Planning, 1993, NIPS.

[10] Leslie Pack Kaelbling, et al. Learning in embedded systems, 1993.

[11] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.

[12] Sebastian Thrun, et al. Explanation-Based Neural Network Learning for Robot Control, 1992, NIPS.

[13] Paul E. Utgoff, et al. A Teaching Method for Reinforcement Learning, 1992, ML.

[14] Satinder P. Singh, et al. The Efficient Learning of Multiple Task Sequences, 1991, NIPS.

[15] Sridhar Mahadevan, et al. Scaling Reinforcement Learning to Robotics by Exploiting the Subsumption Architecture, 1991, ML.

[16] Satinder P. Singh, et al. Transfer of Learning Across Compositions of Sequential Tasks, 1991, ML.

[17] Richard S. Sutton, et al. Planning by Incremental Dynamic Programming, 1991, ML.

[18] Rodney A. Brooks, et al. Learning to Coordinate Behaviors, 1990, AAAI.

[19] Dana H. Ballard, et al. Active Perception and Reinforcement Learning, 1990, Neural Computation.

[20] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[21] C. Watkins. Learning from delayed rewards, 1989.

[22] David E. Goldberg, et al. Genetic Algorithms in Search, Optimization, and Machine Learning, 1988.

[23] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.

[24] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[25] Judea Pearl, et al. Heuristics: intelligent search strategies for computer problem solving, 1984.

[26] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[27] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.

[28] A. V. Napalkov, et al. Computers and Thought, edited by E. A. Feigenbaum and J. Feldman, New York, McGraw-Hill, 1963: Book Review, 1967.

[29] Marvin Minsky, et al. Steps toward Artificial Intelligence, 1961, Proceedings of the IRE.

[30] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.