Iterative Linearized Control: Stable Algorithms and Complexity Guarantees

We examine popular gradient-based algorithms for nonlinear control in light of the modern complexity analysis of first-order optimization algorithms. The examination reveals that the complexity bounds can be stated clearly in terms of calls to a computational oracle that is related to dynamic programming and can be implemented by gradient back-propagation using machine learning software libraries such as PyTorch or TensorFlow. Finally, we propose a regularized Gauss-Newton algorithm that enjoys worst-case complexity bounds and improved convergence behavior in practice. An accompanying software library based on PyTorch is publicly available.
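
To make the oracle idea concrete, the following is a minimal sketch, not the authors' library, of how the gradient of a finite-horizon control objective can be obtained by one forward rollout followed by one backward pass through the linearized dynamics. The scalar dynamics `step`, the quadratic cost, the step size `DT`, and the weights `q` and `r` are all hypothetical choices made for illustration. An automatic differentiation library such as PyTorch would carry out the same backward recursion without the hand-written Jacobians.

```python
import math

DT = 0.1  # hypothetical discretization step, not taken from the paper


def step(x, u):
    """Hypothetical scalar nonlinear dynamics x_{t+1} = f(x_t, u_t)."""
    return x + DT * (math.sin(x) + u)


def step_jac(x, u):
    """Partial derivatives (df/dx, df/du) of the dynamics above."""
    return 1.0 + DT * math.cos(x), DT


def rollout(x0, controls):
    """Forward pass: simulate the trajectory x_0, ..., x_T."""
    xs = [x0]
    for u in controls:
        xs.append(step(xs[-1], u))
    return xs


def cost(x0, controls, q=1.0, r=0.1):
    """Illustrative objective: quadratic penalties on states x_1..x_T and on controls."""
    xs = rollout(x0, controls)
    state_cost = sum(0.5 * q * x * x for x in xs[1:])
    control_cost = sum(0.5 * r * u * u for u in controls)
    return state_cost + control_cost


def gradient(x0, controls, q=1.0, r=0.1):
    """Gradient of the cost with respect to the controls via a backward pass."""
    xs = rollout(x0, controls)
    horizon = len(controls)
    grad = [0.0] * horizon
    lam = q * xs[horizon]  # derivative of the cost with respect to the final state
    for t in reversed(range(horizon)):
        fx, fu = step_jac(xs[t], controls[t])
        # Here lam holds the derivative of the cost with respect to x_{t+1}.
        grad[t] = r * controls[t] + fu * lam
        # Propagate one step back to obtain the derivative with respect to x_t
        # (the value computed at t = 0 is not used afterwards).
        lam = q * xs[t] + fx * lam
    return grad


grad = gradient(1.0, [0.0] * 20)
```

The backward loop visits each time step once, so one gradient evaluation costs about as much as a constant number of rollouts. This is the sense in which a single oracle call is a natural unit for counting complexity.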
