Complementary humanoid behavior shaping using corrective demonstration

A humanoid robot can perform a task through a policy that maps its sensed state to appropriate task actions. We assume that a hand-coded controller can capture such a mapping only for the basic cases of the given task: as situations become more complex, the controller becomes increasingly hard to refine, and such refinements are often tedious and error prone. Building on the fact that a human can detect the failures of a robot executing the hand-coded controller, in this paper we present a corrective learning from demonstration approach to improve the robot's performance. Corrections are captured as new state-action pairs, and during autonomous execution the controller's output is overridden by a demonstrated correction whenever the current state is found to be similar to a corrected state. We focus on the Aldebaran Nao humanoid robot and a concrete, complex ball-dribbling task in an environment with obstacles. We present experimental results showing an improvement in task performance when corrective demonstration is used in addition to the basic hand-coded controller.
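
The following is a minimal sketch of the correction-reuse idea described above, assuming a vector state representation, a Euclidean distance similarity test, and an adjustable threshold; the class, function, and action names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np


class CorrectiveController:
    """Hand-coded controller augmented with corrective demonstrations.

    Corrections are stored as (state, action) pairs; at execution time a
    demonstrated action overrides the hand-coded controller whenever the
    current state is sufficiently similar to a previously corrected state.
    """

    def __init__(self, hand_coded_policy, similarity_threshold=0.2):
        self.hand_coded_policy = hand_coded_policy    # maps state -> action
        self.similarity_threshold = similarity_threshold
        self.corrections = []                         # list of (state, action)

    def add_correction(self, state, action):
        """Record a corrective demonstration as a new state-action pair."""
        self.corrections.append((np.asarray(state, dtype=float), action))

    def act(self, state):
        """Return the demonstrated action if a similar corrected state exists,
        otherwise fall back to the hand-coded controller."""
        state = np.asarray(state, dtype=float)
        best_action, best_dist = None, float("inf")
        for corrected_state, corrected_action in self.corrections:
            dist = np.linalg.norm(state - corrected_state)
            if dist < best_dist:
                best_action, best_dist = corrected_action, dist
        if best_dist <= self.similarity_threshold:
            return best_action
        return self.hand_coded_policy(state)


# Toy usage with a 2-D state (e.g., ball bearing and obstacle distance);
# the state values and action labels are hypothetical.
if __name__ == "__main__":
    controller = CorrectiveController(lambda s: "dribble_forward")
    controller.add_correction([0.9, 0.1], "sidestep_left")   # near-obstacle correction
    print(controller.act([0.88, 0.12]))  # similar to the correction -> "sidestep_left"
    print(controller.act([0.10, 0.90]))  # no similar correction -> "dribble_forward"
```

Under these assumptions, the hand-coded controller remains the default behavior, and the demonstrated corrections act as local overrides in the regions of the state space where the human observed failures.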
