Paper information - Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid

Modern humanoid robots include not only active compliance but also passive compliance. Apart from improving safety and dependability, passive elements such as springs open up new possibilities for improving energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting passive compliance for energy-efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically balanced gait. For the optimization part, we propose a reinforcement learning approach that evolves the policy parametrization dynamically during the learning process. By gradually increasing the representational power of the policy parametrization, it finds better policies faster and at lower computational cost. For the walking generator part, we develop a ZMP-based bipedal walking pattern generator with variable center-of-mass height. The method is tested in real-world experiments on the bipedal robot COMAN and achieves a significant 18% reduction in electric energy consumption by learning to use the robot's passive compliance efficiently.
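
The idea of gradually increasing the representational power of the policy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the center-of-mass height trajectory is a piecewise-linear curve through uniformly spaced knots (a stand-in for whatever parametrization the authors use), the learner is a simple accept-if-better random search (a stand-in for the reinforcement learning algorithm), and `toy_cost` with its sinusoidal target is a hypothetical placeholder for measured energy consumption. The only property the sketch aims to show is that a policy can be refined from n to 2n-1 knots without changing the trajectory it encodes, so learning continues from where the coarser stage stopped.

```python
import numpy as np


def evaluate(knots, t):
    """Piecewise-linear trajectory through uniformly spaced knots on [0, 1]."""
    grid = np.linspace(0.0, 1.0, len(knots))
    return np.interp(t, grid, knots)


def refine(knots):
    """Go from n to 2n-1 knots; the new grid contains the old one,
    so the encoded trajectory is unchanged."""
    fine_grid = np.linspace(0.0, 1.0, 2 * len(knots) - 1)
    return evaluate(knots, fine_grid)


def toy_cost(knots, t, target):
    """Hypothetical stand-in for energy use: squared error to a target curve."""
    return float(np.mean((evaluate(knots, t) - target) ** 2))


def learn(n_stages=3, trials_per_stage=200, seed=0):
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 101)
    # Hypothetical "ideal" CoM height over one gait cycle, in metres.
    target = 0.5 + 0.02 * np.sin(2 * np.pi * t)
    knots = np.full(3, 0.5)  # start with a coarse 3-knot policy
    history = []
    for stage in range(n_stages):
        best = toy_cost(knots, t, target)
        for _ in range(trials_per_stage):
            # Perturb the current policy and keep it only if it is better.
            candidate = knots + rng.normal(0.0, 0.005, size=knots.shape)
            cost = toy_cost(candidate, t, target)
            if cost < best:
                knots, best = candidate, cost
        history.append((len(knots), best))
        if stage < n_stages - 1:
            knots = refine(knots)  # add knots, keep the trajectory
    return knots, history


if __name__ == "__main__":
    final_knots, history = learn()
    for n_knots, cost in history:
        print(n_knots, cost)
```

Starting coarse keeps the early search space small, and each refinement adds degrees of freedom only once the coarser policy has already been improved. On a real robot the cost would come from measured electrical energy over a walking trial rather than from a closed-form function.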
