Reinforcement learning for robot control

Writing control code for mobile robots can be a very time-consuming process. Even for apparently simple tasks, it is often difficult to specify in detail how the robot should accomplish them. Robot control code is typically full of magic numbers that must be painstakingly tuned for each environment in which the robot must operate. The idea of having a robot learn how to accomplish a task, rather than being told explicitly how to do it, is an appealing one. It seems easier and more intuitive for the programmer to specify what the robot should do, and to let it learn the fine details of how to do it. In this paper, we describe JAQL, a framework for efficient reinforcement learning on mobile robots, and present the results of using it to learn control policies for simple tasks.
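The abstract does not spell out JAQL's algorithmic details, so the sketch below only illustrates the general idea it builds on: tabular Q-learning (Watkins and Dayan, 1992), where the programmer writes down a reward (what the robot should achieve) and the learner fills in a policy (how to achieve it). The toy environment, the state and action discretization, and the hyperparameters (ALPHA, GAMMA, EPSILON, the 4x4 grid) are illustrative assumptions, not details taken from the paper.

```python
# Minimal tabular Q-learning sketch for a toy control task.
# The environment and all constants below are assumptions for
# demonstration; they are not taken from the JAQL paper.
import random

N_STATES = 16        # e.g. a coarse discretization of the robot's pose
N_ACTIONS = 4        # e.g. forward, back, turn-left, turn-right
GOAL_STATE = 15

ALPHA = 0.1          # learning rate
GAMMA = 0.9          # discount factor
EPSILON = 0.1        # exploration rate

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    """Hypothetical environment: a 4x4 grid with one goal cell.
    Returns (next_state, reward, done)."""
    row, col = divmod(state, 4)
    if action == 0:
        row = min(row + 1, 3)
    elif action == 1:
        row = max(row - 1, 0)
    elif action == 2:
        col = min(col + 1, 3)
    else:
        col = max(col - 1, 0)
    next_state = row * 4 + col
    done = next_state == GOAL_STATE
    # The reward only encodes *what* to achieve (reach the goal,
    # quickly); the *how* emerges from learning.
    return next_state, (1.0 if done else -0.01), done

for episode in range(500):
    state = 0
    done = False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
        next_state, reward, done = step(state, action)
        # One-step Q-learning update; terminal states bootstrap to zero.
        target = reward + (0.0 if done else GAMMA * max(Q[next_state]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = next_state
```

After training, the greedy policy argmax over Q[s] encodes the learned "how"; only the reward function had to be written by hand, which is the division of labor the abstract argues for.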
