Polytopic Input Constraints in Learning-Based Optimal Control Using Neural Networks

This work considers feed-forward artificial neural networks as parametric approximators in the optimal control of discrete-time systems. Two approaches are introduced to account for polytopic input constraints. The first approach determines (sub-)optimal inputs by applying gradient methods. Closed-form expressions for the gradient of general neural networks with respect to their inputs are derived. This approach can handle state-dependent input constraints and can ensure the satisfaction of state constraints by exploiting recursive reachable set computations. The second approach uses neural networks with softmax output units to map states to parameters that determine (sub-)optimal inputs as a convex combination of the vertices of the input constraint set. The application of both approaches in model predictive control is discussed and illustrated with results obtained for a numerical example.
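
The idea behind the second approach can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the box-shaped input set, and the example logits are all assumptions chosen for illustration. The sketch shows only that softmax outputs are nonnegative and sum to one, so weighting the vertices of a polytope with them always yields an input inside that polytope.

```python
import math


def softmax(logits):
    """Map real-valued logits to nonnegative weights that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def input_from_vertices(logits, vertices):
    """Return the convex combination of the vertices weighted by softmax(logits)."""
    weights = softmax(logits)
    dim = len(vertices[0])
    return [sum(w * v[i] for w, v in zip(weights, vertices)) for i in range(dim)]


# Hypothetical input constraint set: the box [-1, 1] x [-2, 2], given by its vertices.
VERTICES = [(-1.0, -2.0), (1.0, -2.0), (1.0, 2.0), (-1.0, 2.0)]

# Hypothetical pre-softmax network output for some state (one value per vertex).
example_logits = [0.3, -1.2, 2.0, 0.5]

u = input_from_vertices(example_logits, VERTICES)
print(u)  # lies inside the box for any choice of logits
```

Because the constraint is enforced by the structure of the output layer, no projection or clipping step is needed after evaluating the network. One consequence of this construction is that the number of output units equals the number of vertices of the input constraint set.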
