A humanoid robot standing up through learning from demonstration using a multimodal reward function

Humans are known to manage postural movements in a remarkably elegant manner. In the task of standing up from a chair, a humanoid robot can benefit from the variability of human demonstrations. In this paper we propose a novel method for humanoid robots to imitate a dynamic postural movement demonstrated by humans. Since the kinematics of the human participants and of the humanoid robot used in this experiment differ, we solve the correspondence problem by comparing the two in a common reward space, defined by a multimodal reward function composed of balance and effort terms. We fitted a fully actuated triple inverted pendulum model to both the human and the robot. We then used Differential Evolution to find the joint trajectory that minimizes the Kullback-Leibler divergence between the human's and the robot's reward profiles, subject to constraints.
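To make the described pipeline concrete, the sketch below shows one plausible reading of the optimization step: a reward profile with balance and effort terms is evaluated along a discretized triple-inverted-pendulum trajectory, both the human's and the robot's profiles are normalized into distributions, and SciPy's differential_evolution searches for the robot trajectory whose profile is closest in Kullback-Leibler divergence to the human's. The reward weights, the center-of-mass proxy, the trajectory parameterization, and the placeholder human demonstration are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the abstract's optimization loop. All names, weights,
# and the trajectory encoding are illustrative assumptions.
import numpy as np
from scipy.optimize import differential_evolution
from scipy.special import rel_entr  # elementwise p * log(p / q)

T = 25  # time steps in the discretized stand-up motion

def reward_profile(joint_traj, w_balance=1.0, w_effort=0.1):
    """Multimodal reward at each time step: balance term plus effort term.

    joint_traj: (T, 3) array of ankle/knee/hip angles for a fully
    actuated triple inverted pendulum (a stand-in for the full model).
    """
    # Balance: penalize deviation of a crude CoM proxy from upright.
    com_proxy = joint_traj.sum(axis=1)  # hypothetical CoM angle
    balance = -w_balance * com_proxy**2
    # Effort: penalize squared joint velocities (finite differences).
    vel = np.diff(joint_traj, axis=0, prepend=joint_traj[:1])
    effort = -w_effort * (vel**2).sum(axis=1)
    return balance + effort

def to_distribution(profile):
    """Shift and normalize a reward profile so KL divergence is defined."""
    p = profile - profile.min() + 1e-8
    return p / p.sum()

def kl_to_human(params, human_profile):
    """Objective: KL divergence between human and robot reward profiles."""
    joint_traj = params.reshape(T, 3)  # decode flat parameter vector
    q = to_distribution(reward_profile(joint_traj))
    p = to_distribution(human_profile)
    return rel_entr(p, q).sum()

# Placeholder "human" demonstration; in the paper this would come from
# motion-captured sit-to-stand trials fitted to the pendulum model.
human_traj = np.linspace([1.2, 1.5, 0.4], [0.0, 0.0, 0.0], T)
human_profile = reward_profile(human_traj)

# Joint-limit constraints enter as box bounds on every angle sample.
bounds = [(-1.6, 1.6)] * (T * 3)
result = differential_evolution(kl_to_human, bounds,
                                args=(human_profile,),
                                maxiter=100, seed=0, polish=False)
print("final KL divergence:", result.fun)
```

Encoding the trajectory as raw angle samples keeps the sketch short; a spline or movement-primitive parameterization would shrink the search space considerably and is closer to common practice for this kind of problem.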
