Gershgorin Loss Stabilizes the Recurrent Neural Network Compartment of an End-to-end Robot Learning Scheme

Traditional robotic control suits require profound task-specific knowledge for designing, building and testing control software. The rise of Deep Learning has enabled end-to-end solutions to be learned entirely from data, requiring minimal knowledge about the application area. We design a learning scheme to train end-to-end linear dynamical systems (LDS)s by gradient descent in imitation learning robotic domains. We introduce a new regularization loss component together with a learning algorithm that improves the stability of the learned autonomous system, by forcing the eigenvalues of the internal state updates of an LDS to be negative reals. We evaluate our approach on a series of real-life and simulated robotic experiments, in comparison to linear and nonlinear Recurrent Neural Network (RNN) architectures. Our results show that our stabilizing method significantly improves test performance of LDS, enabling such linear models to match the performance of contemporary nonlinear RNN architectures. A video of the obstacle avoidance performance of our method on a mobile robot, in unseen environments, compared to other methods can be viewed at https://youtu.be/mhEsCoNao5E.

[1]  Radu Grosu,et al.  Response Characterization for Auditing Cell Dynamics in Long Short-term Memory Networks , 2018, 2019 International Joint Conference on Neural Networks (IJCNN).

[2]  David Duvenaud,et al.  Neural Ordinary Differential Equations , 2018, NeurIPS.

[3]  Nitish Srivastava,et al.  Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.

[4]  Geoffrey Biggs,et al.  A Survey of Robot Programming Systems , 2010 .

[5]  Aude Billard,et al.  Learning control Lyapunov function to ensure stability of dynamical system-based robot reaching motions , 2014, Robotics Auton. Syst..

[6]  Saibal Mukhopadhyay,et al.  HybridNet: Integrating Model-based and Data-driven Learning to Predict Evolution of Dynamical Systems , 2018, CoRL.

[7]  Caroline Blocher,et al.  Learning stable dynamical systems using contraction theory , 2017, 2017 14th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI).

[8]  Sergey Levine,et al.  One-Shot Visual Imitation Learning via Meta-Learning , 2017, CoRL.

[9]  Marco Pavone,et al.  Robust online motion planning via contraction theory and convex optimization , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[10]  Juan Carlos Niebles,et al.  Visual Forecasting by Imitating Dynamics in Natural Sequences , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[11]  Ronald J. Williams,et al.  A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.

[12]  Radu Grosu,et al.  Designing Worm-inspired Neural Networks for Interpretable Robotic Control , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[13]  Aude Billard,et al.  A Physically-Consistent Bayesian Non-Parametric Mixture Model for Dynamical System Learning , 2018, CoRL.

[14]  Razvan Pascanu,et al.  Sim-to-Real Robot Learning from Pixels with Progressive Nets , 2016, CoRL.

[15]  Lin-Bao Yang,et al.  Cellular neural networks: theory , 1988 .

[16]  Aude Billard,et al.  On Learning, Representing, and Generalizing a Task in a Humanoid Robot , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[17]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[18]  Yoshua Bengio,et al.  Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .

[19]  Ashwin P. Dani,et al.  Learning Partially Contracting Dynamical Systems from Demonstrations , 2017, CoRL.

[20]  Sergey Levine,et al.  Learning deep neural network policies with continuous memory states , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[21]  Yoshua Bengio,et al.  Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.

[22]  Razvan Pascanu,et al.  On the difficulty of training recurrent neural networks , 2012, ICML.

[23]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[24]  T. Sideris Ordinary Differential Equations and Dynamical Systems , 2013 .

[25]  Aude Billard,et al.  Learning Stable Nonlinear Dynamical Systems With Gaussian Mixture Models , 2011, IEEE Transactions on Robotics.

[26]  Radu Grosu,et al.  Liquid Time-constant Recurrent Neural Networks as Universal Approximators , 2018, ArXiv.

[27]  Stefan Schaal,et al.  Robot Learning From Demonstration , 1997, ICML.

[28]  Dit-Yan Yeung,et al.  Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.

[29]  Sergey Levine,et al.  Guided Policy Search , 2013, ICML.

[30]  Klaus Neumann,et al.  Neural learning of vector fields for encoding stable dynamical systems , 2014, Neurocomputing.

[31]  Mathukumalli Vidyasagar,et al.  A learning theory approach to system identification and stochastic adaptive control , 2008 .

[32]  William H. Press,et al.  Numerical Recipes 3rd Edition: The Art of Scientific Computing , 2007 .

[33]  Tengyu Ma,et al.  Gradient Descent Learns Linear Dynamical Systems , 2016, J. Mach. Learn. Res..

[34]  Yuichi Nakamura,et al.  Approximation of dynamical systems by continuous time recurrent neural networks , 1993, Neural Networks.

[35]  Rajesh P. N. Rao,et al.  Robotic imitation from human motion capture using Gaussian processes , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005..

[36]  David P. Rodgers,et al.  Improvements in multiprocessor system design , 1985, ISCA '85.

[37]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[38]  Paul J. Werbos,et al.  Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.

[39]  Andreas Krause,et al.  The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems , 2018, CoRL.

[40]  Jun Nakanishi,et al.  Dynamical Movement Primitives: Learning Attractor Models for Motor Behaviors , 2013, Neural Computation.