Robot Learning From Demonstration

The goal of robot learning from demonstration is for a robot to learn a task by watching a demonstration of that task being performed. In our approach, the robot learns a reward function from the demonstration and a task model from its own repeated attempts to perform the task; a policy is then computed from the learned reward function and task model. Lessons from an implementation on an anthropomorphic robot arm performing a pendulum swing-up task include: (1) simply mimicking demonstrated motions is not adequate to perform this task; (2) a task planner can use a learned model and reward function to compute an appropriate policy; (3) this model-based planning process supports rapid learning; (4) both parametric and nonparametric models can be learned and used; and (5) incorporating a non-model-based, task-level direct learning component alongside the model-based planner helps compensate for structural modeling errors and slow model learning.
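The pipeline described above, fit a task model from trial data, pair it with a reward function, and plan a policy, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear dynamics model, the quadratic reward, and the LQR planner are stand-in assumptions chosen to keep the example short.

```python
# Hedged sketch of model-based policy computation: fit a parametric
# (linear) dynamics model from trial data, define a quadratic reward
# standing in for the learned reward function, and plan with LQR.
import numpy as np

# Hypothetical trial data from repeated attempts: x_next = A x + B u + noise.
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.2, 0.9]])
B_true = np.array([[0.0], [0.1]])
X = rng.normal(size=(200, 2))          # observed states
U = rng.normal(size=(200, 1))          # applied actions
Xn = X @ A_true.T + U @ B_true.T + 0.01 * rng.normal(size=(200, 2))

# Learn the model by least squares on (state, action) -> next state.
Z = np.hstack([X, U])
W, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
A_hat, B_hat = W[:2].T, W[2:].T

# Plan with the learned model: finite-horizon Riccati recursion toward
# the goal state x = 0 under a quadratic cost (negative reward).
Q, R = np.eye(2), np.eye(1)
P = Q.copy()
for _ in range(100):
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    P = Q + A_hat.T @ P @ (A_hat - B_hat @ K)

# Resulting policy: u = -K x.
print(np.round(K, 3))
```

The same structure carries over when the model is nonparametric (e.g. locally weighted regression, as in the authors' related work) and the planner is a trajectory optimizer rather than LQR; only the model-fitting and planning steps change.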
