Contracting Implicit Recurrent Neural Networks: Stable Models with Improved Trainability

Stability of recurrent models is closely linked with trainability, generalizability and, in some applications, safety. Methods that train stable recurrent neural networks, however, do so at a significant cost to expressibility. We propose an implicit model structure that allows for a convex parametrization of stable models using contraction analysis of non-linear systems. Using these stability conditions, we propose a new approach to model initialization and then provide empirical results comparing the performance of our proposed model set with that of previous stable RNNs and vanilla RNNs. By carefully controlling stability in the model, we observe a significant improvement in both training speed and model performance.
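To make the notion of contraction concrete, the following is a minimal sketch of what it means for a recurrent model to be stable in this sense: two state trajectories driven by the same input converge to each other regardless of their initial conditions. It is not the implicit model or the convex parametrization proposed here. It assumes a plain RNN update x_{t+1} = tanh(W x_t + B u_t) and enforces the simpler, more conservative condition that the spectral norm of W is below one, in the spirit of earlier stable RNNs. The names `rescale_to_contraction`, `rnn_step` and `trajectory_gap`, the dimensions, and the contraction rate 0.9 are all illustrative assumptions.

```python
import numpy as np


def rescale_to_contraction(W, gamma=0.9):
    """Scale W so that its spectral norm is at most gamma (assumed < 1)."""
    norm = np.linalg.norm(W, 2)  # largest singular value of W
    return W if norm <= gamma else W * (gamma / norm)


def rnn_step(x, u, W, B):
    """One step of an illustrative vanilla RNN (not the implicit model)."""
    return np.tanh(W @ x + B @ u)


def trajectory_gap(W, B, inputs, x0_a, x0_b):
    """Distance between two trajectories that share the same input sequence."""
    xa, xb = x0_a, x0_b
    gaps = [np.linalg.norm(xa - xb)]
    for u in inputs:
        xa = rnn_step(xa, u, W, B)
        xb = rnn_step(xb, u, W, B)
        gaps.append(np.linalg.norm(xa - xb))
    return np.array(gaps)


rng = np.random.default_rng(0)
n, m, T = 8, 2, 50  # state size, input size, sequence length (arbitrary)
W = rescale_to_contraction(rng.normal(size=(n, n)), gamma=0.9)
B = rng.normal(size=(n, m))
inputs = rng.normal(size=(T, m))

gaps = trajectory_gap(W, B, inputs, rng.normal(size=n), rng.normal(size=n))
print(gaps[0], gaps[-1])
```

Because tanh is 1-Lipschitz, each step shrinks the gap between the two trajectories by at least the factor 0.9, so the influence of the initial state decays geometrically. Constraining the spectral norm in this way is one source of the loss of expressibility that the abstract refers to.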
