The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems

Learning algorithms have shown considerable prowess in simulation by allowing robots to adapt to uncertain environments and improve their performance. However, such algorithms are rarely used in practice on safety-critical systems, since the learned policy typically does not yield any safety guarantees. That is, the required exploration may cause physical harm to the robot or its environment. In this paper, we present a method to learn accurate safety certificates for nonlinear, closed-loop dynamical systems. Specifically, we construct a neural network Lyapunov function and a training algorithm that adapts it to the shape of the largest safe region in the state space. The algorithm relies only on knowledge of inputs and outputs of the dynamics, rather than on any specific model structure. We demonstrate our method by learning the safe region of attraction for a simulated inverted pendulum. Furthermore, we discuss how our method can be used in safe learning algorithms together with statistical models of dynamical systems.
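To make the idea of a Lyapunov-certified safe region concrete, the following minimal sketch estimates a region of attraction for a simulated inverted pendulum by sampling. It is not the paper's training algorithm: a fixed quadratic candidate V(x) = x^T P x stands in for the neural network Lyapunov function, and the pendulum parameters, the saturated linear controller, the Euler discretization, and the grid bounds are all assumptions chosen for illustration. The sketch only evaluates the closed-loop map on sampled states, then reports the largest level c such that every sampled state with V(x) < c satisfies the decrease condition V(f(x)) < V(x).

```python
import math

# Hypothetical pendulum and controller parameters (assumptions, not from the paper).
G, LENGTH, MASS, FRICTION = 9.81, 1.0, 1.0, 0.1
K_THETA, K_OMEGA, U_MAX = 20.0, 5.0, 5.0
DT = 0.01  # Euler step of the discrete-time closed loop


def step(theta, omega):
    """One Euler step of the pendulum under a saturated linear controller."""
    u = max(-U_MAX, min(U_MAX, -K_THETA * theta - K_OMEGA * omega))
    inertia = MASS * LENGTH ** 2
    accel = G / LENGTH * math.sin(theta) + (u - FRICTION * omega) / inertia
    return theta + DT * omega, omega + DT * accel


def lyapunov_matrix():
    """Solve A^T P + P A = -I for the linearized, unsaturated closed loop
    A = [[0, 1], [-a, -d]]; returns the entries (p11, p12, p22) of P."""
    inertia = MASS * LENGTH ** 2
    a = K_THETA / inertia - G / LENGTH      # effective stiffness
    d = (FRICTION + K_OMEGA) / inertia      # effective damping
    p12 = 1.0 / (2.0 * a)
    p22 = (p12 + 0.5) / d
    p11 = d * p12 + a * p22
    return p11, p12, p22


P11, P12, P22 = lyapunov_matrix()


def V(theta, omega):
    """Quadratic Lyapunov candidate (stand-in for a learned network)."""
    return P11 * theta ** 2 + 2.0 * P12 * theta * omega + P22 * omega ** 2


def decreases(theta, omega):
    """Decrease condition V(f(x)) < V(x) at a single state."""
    return V(*step(theta, omega)) < V(theta, omega)


def estimate_level(n=101, theta_max=1.0, omega_max=4.0):
    """Largest c such that all grid states with V(x) < c satisfy the
    decrease condition. Grid-edge states also cap c, so the level set
    stays inside the sampled box."""
    c = math.inf
    for i in range(n):
        theta = -theta_max + 2.0 * theta_max * i / (n - 1)
        for j in range(n):
            omega = -omega_max + 2.0 * omega_max * j / (n - 1)
            if theta == 0.0 and omega == 0.0:
                continue  # the equilibrium itself is excluded
            on_edge = i in (0, n - 1) or j in (0, n - 1)
            if on_edge or not decreases(theta, omega):
                c = min(c, V(theta, omega))
    return c


c_max = estimate_level()
print("estimated safe level c =", c_max)
```

Because the check runs only on grid points, the result is an estimate rather than a certificate; turning it into a guarantee would require bounding the behavior between samples. A fixed quadratic shape also tends to underestimate the true region, which is the gap a more flexible, adaptable candidate is meant to close.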
