Minimax differential dynamic programming: application to a biped walking robot

We developed a robust control policy design method for high-dimensional state spaces using differential dynamic programming with a minimax criterion. As an example, we applied our method to a simulated five-link biped robot. The results show lower joint torques from the optimal control policy compared to a hand-tuned PD servo controller. Results also show that the simulated biped robot can successfully walk despite unknown disturbances that cause controllers generated by standard differential dynamic programming and the hand-tuned PD servo to fail. Learning to compensate for modeling error and previously unknown disturbances in conjunction with robust control design is also demonstrated. We also applied the proposed method to a real biped robot for optimizing swing-leg trajectories.
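To make the minimax criterion concrete, the following is a minimal sketch of a minimax backward pass for a linear-quadratic approximation of the dynamics, in the spirit of game-theoretic DDP: the control u minimizes the value while a bounded disturbance w maximizes it, penalized by a robustness weight gamma. The dynamics matrices, cost weights, and the function name are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

def minimax_backward_pass(A, B, D, Q, R, gamma, horizon):
    """Hedged sketch of a minimax (H-infinity-style) Riccati recursion.

    Assumed model (illustrative, not from the paper):
      dynamics:  x_{t+1} = A x + B u + D w
      stage cost: x'Qx + u'Ru - gamma^2 w'w
    u minimizes, the disturbance w maximizes.
    """
    V = Q.copy()                                   # terminal value Hessian
    control_gains = []
    for _ in range(horizon):
        # Quadratic expansion of the stage Q-function.
        Qxx = Q + A.T @ V @ A
        Quu = R + B.T @ V @ B
        Qww = -gamma**2 * np.eye(D.shape[1]) + D.T @ V @ D   # must stay negative definite
        Qux = B.T @ V @ A
        Qwx = D.T @ V @ A
        Quw = B.T @ V @ D
        # Stationarity in both players: solve the coupled linear system
        #   Quu u + Quw w = -Qux x,   Quw' u + Qww w = -Qwx x
        M = np.block([[Quu, Quw], [Quw.T, Qww]])
        N = np.vstack([Qux, Qwx])
        K = -np.linalg.solve(M, N)                 # stacked gains [Ku; Kw]
        Ku, Kw = K[:B.shape[1]], K[B.shape[1]:]
        # Value update with both saddle-point policies substituted back in.
        V = Qxx + Qux.T @ Ku + Qwx.T @ Kw
        V = 0.5 * (V + V.T)                        # keep the Hessian symmetric
        control_gains.append(Ku)
    return V, control_gains
```

When gamma is large enough (above the system's attainable disturbance attenuation level), Qww remains negative definite and the recursion stays bounded; as gamma goes to infinity the disturbance term vanishes and the recursion reduces to standard LQR/DDP.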
