Simulation-based design of dynamic controllers for humanoid balancing

Model-based trajectory optimization often fails to find reference trajectories for under-actuated bipedal robots performing highly dynamic, contact-rich tasks in the real world because the underlying physical models are inaccurate. In this paper, we propose a complete system that automatically designs a reference trajectory that succeeds at such tasks in the real world using only a small number of hardware experiments. We adopt existing system identification techniques and show that, with appropriate model parameterization and control optimization, an iterative system identification framework can be effective for designing reference trajectories. We focus on a set of tasks that exploit a momentum-transfer strategy: rapidly moving the whole body from an initial configuration to a target configuration by generating large accelerations at the center of mass and switching contacts.
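The iterative loop described above (plan in simulation, execute on hardware, refit the model, repeat) can be illustrated with a deliberately minimal sketch. Everything here is a toy assumption, not the paper's actual method: a 1-D point mass stands in for the humanoid, a scalar mass stands in for the model parameterization, and `rollout` with the hidden true mass stands in for a hardware experiment.

```python
def rollout(mass, u, steps):
    """1-D point mass: each step moves by u / mass (toy dynamics)."""
    x = 0.0
    for _ in range(steps):
        x += u / mass
    return x

def iterative_sysid(target, steps=10, mass_est=1.0, true_mass=2.5, iters=5):
    """Alternate between planning under the current model, executing on the
    'real' system, and refitting the model parameter from the observation."""
    for _ in range(iters):
        # Control optimization in simulation: pick u so the *model* reaches
        # the target (closed form here; CMA-ES or DDP in the general case).
        u = mass_est * target / steps
        # 'Hardware' experiment, governed by the unknown true mass.
        x_real = rollout(true_mass, u, steps)
        # System identification: choose mass_est that explains the rollout.
        mass_est = u * steps / x_real
    # Re-plan with the identified model and execute once more.
    x_final = rollout(true_mass, mass_est * target / steps, steps)
    return mass_est, x_final

mass_est, x_final = iterative_sysid(target=1.0)
```

In this linear toy the loop recovers the true mass after a single iteration; the point of the sketch is only the alternation structure, not the convergence behavior, which for contact-rich whole-body tasks requires the model parameterization and optimization machinery discussed in the paper.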
