Paper information - Iterative Linearized Control: Stable Algorithms and Complexity Guarantees


We examine popular gradient-based algorithms for nonlinear control in light of the modern complexity analysis of first-order optimization algorithms. The examination reveals that the complexity bounds can be stated clearly in terms of calls to a computational oracle that is related to dynamic programming and can be implemented by gradient back-propagation using machine learning libraries such as PyTorch or TensorFlow. Finally, we propose a regularized Gauss-Newton algorithm that enjoys worst-case complexity bounds and improved convergence behavior in practice. The accompanying software library, based on PyTorch, is publicly available.
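To make the back-propagation oracle mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm or its library: the scalar dynamics x_{t+1} = x_t + dt (sin(x_t) + u_t), the quadratic costs, and every name (`rollout`, `total_cost`, `cost_gradient`, `target`, `reg`, `dt`) are assumptions chosen for illustration. The point is only that the gradient of a control objective with respect to the whole control sequence can be obtained with one forward simulation and one backward pass, which is what automatic differentiation in PyTorch or TensorFlow would do for a general model.

```python
import math


def rollout(x0, controls, dt=0.1):
    """Simulate the toy dynamics x_{t+1} = x_t + dt * (sin(x_t) + u_t)."""
    states = [x0]
    for u in controls:
        x = states[-1]
        states.append(x + dt * (math.sin(x) + u))
    return states


def total_cost(x0, controls, target=1.0, reg=0.01, dt=0.1):
    """Sum of quadratic tracking costs on states and penalties on controls."""
    states = rollout(x0, controls, dt)
    state_cost = sum(0.5 * (x - target) ** 2 for x in states[1:])
    control_cost = sum(0.5 * reg * u ** 2 for u in controls)
    return state_cost + control_cost


def cost_gradient(x0, controls, target=1.0, reg=0.01, dt=0.1):
    """Gradient of total_cost w.r.t. the controls via one backward pass."""
    states = rollout(x0, controls, dt)
    horizon = len(controls)
    grad = [0.0] * horizon
    adjoint = 0.0  # sensitivity of the future cost, propagated backward
    for t in reversed(range(horizon)):
        # Add the direct cost derivative at x_{t+1}; adjoint is now the
        # total derivative of the cost with respect to x_{t+1}.
        adjoint += states[t + 1] - target
        # u_t enters x_{t+1} with derivative dt, plus its own penalty.
        grad[t] = reg * controls[t] + dt * adjoint
        # Chain rule through x_{t+1} = x_t + dt * (sin(x_t) + u_t).
        adjoint *= 1.0 + dt * math.cos(states[t])
    return grad
```

The backward loop has the same cost as the forward simulation, so one oracle call scales linearly with the horizon. A finite-difference comparison against `total_cost` is a simple way to sanity-check such a hand-written backward pass.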
