Control-limited differential dynamic programming

Trajectory optimizers are a powerful class of methods for generating goal-directed robot motion. Differential Dynamic Programming (DDP) is an indirect method that optimizes only over the unconstrained control space and is therefore fast enough to allow real-time control of a full humanoid robot on modern computers. Although indirect methods automatically take state constraints into account, control limits pose a difficulty. This is particularly problematic when an expensive robot is strong enough to break itself. In this paper, we demonstrate that simple heuristics used to enforce limits (clamping and penalizing) are not efficient in general. We then propose a generalization of DDP that accommodates box inequality constraints on the controls without significantly sacrificing convergence quality or computational effort. We apply our algorithm to three simulated problems, including the 36-DoF HRP-2 robot. A movie of our results can be found at goo.gl/eeiMnn.
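To make the remark about clamping concrete, the following minimal sketch compares two ways of handling box limits on a single two-dimensional quadratic control subproblem. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the matrix `H`, vector `g`, the limits, and all function names are hypothetical, and the box-constrained minimizer is found with plain projected gradient descent chosen only for simplicity.

```python
# Hypothetical 2-D quadratic model q(u) = 0.5 * u^T H u + g^T u with limits lo <= u <= hi.
# All numbers and names below are illustrative assumptions, not taken from the paper.

def quad(H, g, u):
    """Value of q(u) = 0.5 * u^T H u + g^T u."""
    Hu = [H[0][0] * u[0] + H[0][1] * u[1], H[1][0] * u[0] + H[1][1] * u[1]]
    return 0.5 * (u[0] * Hu[0] + u[1] * Hu[1]) + g[0] * u[0] + g[1] * u[1]

def clip(u, lo, hi):
    """Project u onto the box [lo, hi] component-wise."""
    return [min(max(x, lo), hi) for x in u]

def clamped_newton(H, g, lo, hi):
    """Heuristic: compute the unconstrained minimizer -H^{-1} g, then clamp it."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    u = [-(H[1][1] * g[0] - H[0][1] * g[1]) / det,
         -(H[0][0] * g[1] - H[1][0] * g[0]) / det]
    return clip(u, lo, hi)

def box_qp_projected_gradient(H, g, lo, hi, step=0.5, iters=1000):
    """Minimize q(u) over the box by projected gradient descent.

    The step must not exceed 1 / (largest eigenvalue of H); 0.5 is valid
    for the example matrix below, whose largest eigenvalue is 1.9.
    """
    u = [0.0, 0.0]
    for _ in range(iters):
        grad = [H[0][0] * u[0] + H[0][1] * u[1] + g[0],
                H[1][0] * u[0] + H[1][1] * u[1] + g[1]]
        u = clip([u[0] - step * grad[0], u[1] - step * grad[1]], lo, hi)
    return u

# Strongly coupled controls: the off-diagonal term makes clamping lossy.
H = [[1.0, 0.9], [0.9, 1.0]]
g = [-1.0, -0.8]
lo, hi = -1.0, 1.0

u_clamped = clamped_newton(H, g, lo, hi)
u_box = box_qp_projected_gradient(H, g, lo, hi)

print("clamped Newton:", u_clamped, "cost", quad(H, g, u_clamped))
print("box-constrained:", u_box, "cost", quad(H, g, u_box))
```

In this example both approaches saturate the first control at its upper limit, but only the box-constrained solve re-optimizes the second control once the first is pinned, so it reaches a lower cost. Clamping ignores the coupling between controls that `H` encodes.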
