The National Science Foundation Workshop on Reinforcement Learning

Reinforcement learning has become one of the most actively studied learning frameworks in the area of intelligent autonomous agents. This article describes the results of a three-day meeting of leading researchers in this area that was sponsored by the National Science Foundation. Because reinforcement learning is an interdisciplinary topic, the workshop brought together researchers from a variety of fields, including machine learning, neural networks, AI, robotics, and operations research. Thirty leading researchers from the United States, Canada, Europe, and Japan, representing from many different universities, government, and industrial research laboratories participated in the workshop. The goals of the meeting were to (1) understand limitations of current reinforcement-learning systems and define promising directions for further research; (2) clarify the relationships between reinforcement learning and existing work in engineering fields, such as operations research; and (3) identify potential industrial applications of reinforcement learning.

[1]  Minoru Asada,et al.  Reasonable performance in less learning time by real robot based on incremental state space segmentation , 1996, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96.

[2]  Prasad Tadepalli,et al.  Auto-Exploratory Average Reward Reinforcement Learning , 1996, AAAI/IAAI, Vol. 1.

[3]  Stuart J. Russell,et al.  Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.

[4]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[5]  Mahesan Niranjan,et al.  On-line Q-learning using connectionist systems , 1994 .

[6]  Stuart J. Russell,et al.  Local Learning in Probabilistic Networks with Hidden Variables , 1995, IJCAI.

[7]  Ben J. A. Kröse,et al.  Learning from delayed rewards , 1995, Robotics Auton. Syst..

[8]  Satinder Singh,et al.  Learning to Solve Markovian Decision Processes , 1993 .

[9]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[10]  Wei Zhang,et al.  A Reinforcement Learning Approach to job-shop Scheduling , 1995, IJCAI.

[11]  Prasad Tadepalli,et al.  Scaling Up Average Reward Reinforcement Learning by Approximating the Domain Models and the Value Function , 1996, ICML.

[12]  Minoru Asada,et al.  Behavior coordination for a mobile robot using modular reinforcement learning , 1996, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96.

[13]  Sebastian Thrun,et al.  Finding Structure in Reinforcement Learning , 1994, NIPS.

[14]  A. F. Veinott Discrete Dynamic Programming with Sensitive Discount Optimality Criteria , 1969 .

[15]  P. Dayan,et al.  A framework for mesencephalic dopamine systems based on predictive Hebbian learning , 1996, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[16]  Peter Dayan,et al.  Bee foraging in uncertain environments using predictive hebbian learning , 1995, Nature.