Stability of Gated Recurrent Unit Neural Networks: Convex Combination Formulation Approach

In this paper, a particular discrete-time, nonlinear, time-invariant system represented as a vector difference equation is analyzed for its stability properties. The motivation for analyzing this particular system is that it models gated recurrent unit neural networks, which are commonly used and well known in machine learning applications. From the technical perspective, the analyses exploit the system's similarity to a convex combination of discrete-time systems, where one of the systems is trivial, and thus the overall properties depend mostly on the other one. Stability results are formulated for the nonlinear system and its linearization with respect to the system's (in general, multiple) equilibria. To motivate and illustrate the potential of these results in applications, some particular results are derived for gated recurrent unit neural network models, and a connection between local stability analysis and learning is provided.
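The convex combination structure mentioned above can be made concrete with the standard GRU state update, h_{k+1} = z ⊙ h_k + (1 − z) ⊙ h̃, in which the update gate z blends the trivial (identity) system with the nonlinear candidate system. The following is a minimal sketch of the autonomous case (constant weights and biases, no external input); all weight values are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # hidden-state dimension

# Hypothetical constant weights and biases (small scale for illustration)
Uz, Ur, Uh = (0.3 * rng.standard_normal((n, n)) for _ in range(3))
bz, br, bh = (0.1 * rng.standard_normal(n) for _ in range(3))

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(h):
    """One step of the autonomous GRU difference equation."""
    z = sigmoid(Uz @ h + bz)              # update gate, entries in (0, 1)
    r = sigmoid(Ur @ h + br)              # reset gate
    h_tilde = np.tanh(Uh @ (r * h) + bh)  # candidate state, entries in (-1, 1)
    # Componentwise convex combination of the trivial (identity) system h
    # and the nonlinear candidate system h_tilde, weighted by the gate z:
    return z * h + (1.0 - z) * h_tilde

# Iterate the map: since each component of h_{k+1} is a convex combination
# of h_k and a value in (-1, 1), the trajectory stays within
# max(||h_0||_inf, 1) in each coordinate.
h0 = rng.standard_normal(n)
h = h0.copy()
for _ in range(200):
    h = gru_step(h)
```

Because z lies strictly in (0, 1), stability is governed largely by the candidate system: for small enough weights the candidate map is a contraction and the iteration settles toward an equilibrium, which is the intuition behind analyzing the non-trivial member of the convex combination.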
