Learning the skill of archery by a humanoid robot iCub

We present an integrated approach that allows the humanoid robot iCub to learn the skill of archery. After being instructed how to hold the bow and release the arrow, the robot learns by itself to shoot the arrow so that it hits the center of the target. Two learning algorithms are proposed and compared for learning the bimanual skill: one based on Expectation-Maximization-based Reinforcement Learning, and one based on chained vector regression, called the ARCHER algorithm. Both algorithms are used to modulate and coordinate the motion of the two hands, while an inverse kinematics controller handles the motion of the arms. The image-processing component recognizes where the arrow hits the target, using Gaussian Mixture Models for color-based detection of the target and the arrow's tip. The approach is evaluated on the 53-DOF humanoid robot iCub.
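The EM-based reinforcement learning mentioned above follows the general reward-weighted update scheme: the robot perturbs its policy parameters (e.g. the release configuration of the hands), observes the reward from where the arrow lands, and shifts the parameters toward the reward-weighted mean of the perturbations. The following is a minimal sketch of that idea in a toy setting; the 2-D parameter vector, the Gaussian reward model, and all numeric values are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D policy parameters, e.g. release offsets of the bow hand.
theta = np.array([0.5, -0.3])   # current policy mean
sigma = 0.2                     # exploration noise standard deviation
target = np.array([0.0, 0.0])   # center of the archery target

def reward(params):
    # Toy reward model: larger when the simulated hit point is
    # closer to the target center (assumed, for illustration only).
    return np.exp(-np.linalg.norm(params - target) ** 2)

# EM-style reward-weighted updates, in the spirit of PoWER-like methods:
for _ in range(30):
    eps = sigma * rng.standard_normal((10, theta.size))  # 10 exploratory rollouts
    rollouts = theta + eps
    w = np.array([reward(p) for p in rollouts])          # per-rollout rewards
    # Shift the policy mean toward the reward-weighted mean perturbation.
    theta = theta + (w @ eps) / (w.sum() + 1e-12)

print(np.round(theta, 2))  # the policy mean moves toward the target center
```

Because the update is a weighted average of tried perturbations rather than a gradient step, no learning rate needs tuning, which is one reason EM-based methods are attractive for skill learning on real robots with few rollouts.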
