Synthesis and stabilization of complex behaviors through online trajectory optimization

We present an online trajectory optimization method and software platform applicable to complex humanoid robots performing challenging tasks such as getting up from an arbitrary pose on the ground and recovering from large disturbances using dexterous acrobatic maneuvers. The resulting behaviors, illustrated in the attached video, are computed only 7× slower than real time on a standard PC. The video also shows results on the acrobot problem, planar swimming, and one-legged hopping. These simpler problems can already be solved in real time, without pre-computing anything.
