Game Theoretic continuous time Differential Dynamic Programming

In this work, we derive a Game Theoretic Differential Dynamic Programming (GT-DDP) algorithm in continuous time. We provide a set of backward differential equations for the value function expansion, without assuming that the initial nominal control is close to the optimal control solution, and derive the update law for the controls. We introduce the GT-DDP algorithm and analyze the effect of the game theoretic formulation on the feed-forward and feedback parts of the control policies. Furthermore, we investigate the performance of GT-DDP through simulations of an inverted pendulum with conflicting controls, and we apply the control gains to a stochastic system to demonstrate how the design of the cost function affects the feed-forward and feedback parts of the control policies. Finally, we conclude with possible future directions.
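
For orientation, the kind of problem the abstract describes can be sketched as a generic two-player zero-sum differential game. The notation below is assumed for illustration and is not taken from the paper: $u$ is the minimizing control, $v$ the conflicting (maximizing) control, $F$ the dynamics, $L$ the running cost, $\phi$ the terminal cost, and $l_u, l_v, K_u, K_v$ the feed-forward and feedback terms of the control updates.

```latex
% Generic min-max formulation (assumed notation, not the paper's own)
\begin{align}
  \dot{x}(t) &= F\bigl(x(t), u(t), v(t), t\bigr), \qquad x(t_0) = x_0, \\
  V(x, t) &= \min_{u} \max_{v}
    \Bigl[ \phi\bigl(x(t_f)\bigr)
      + \int_{t}^{t_f} L\bigl(x(\tau), u(\tau), v(\tau), \tau\bigr)\, d\tau \Bigr], \\
  -\frac{\partial V}{\partial t} &= \min_{u} \max_{v}
    \Bigl[ L(x, u, v, t) + V_x^{\top} F(x, u, v, t) \Bigr],
    \qquad V(x, t_f) = \phi(x), \\
  % Local control updates around a nominal trajectory (\bar{x}, \bar{u}, \bar{v}):
  \delta u &= l_u + K_u\, \delta x, \qquad
  \delta v = l_v + K_v\, \delta x .
\end{align}
```

In this reading, $l_u$ and $l_v$ are the feed-forward parts and $K_u\,\delta x$ and $K_v\,\delta x$ the feedback parts of the two policies, and a backward pass would propagate an expansion of $V$ along the nominal trajectory to obtain them.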
