Backpropagation algorithms and Reservoir Computing in Recurrent Neural Networks for the forecasting of complex spatiotemporal dynamics

We examine the efficiency of Recurrent Neural Networks in forecasting the spatiotemporal dynamics of high-dimensional and reduced-order complex systems using Reservoir Computing (RC) and Backpropagation Through Time (BPTT) for gated network architectures. We highlight the advantages and limitations of each method and discuss their implementation on parallel computing architectures. We quantify the relative prediction accuracy of these algorithms for the long-term forecasting of chaotic systems using the Lorenz-96 and the Kuramoto-Sivashinsky (KS) equations as benchmarks. We find that, when the full state dynamics are available for training, RC outperforms BPTT approaches in terms of predictive performance and in capturing the long-term statistics, while requiring much less training time. However, in the case of reduced-order data, large-scale RC models can be unstable and are more likely than the BPTT algorithms to diverge. In contrast, RNNs trained via BPTT show superior forecasting abilities and capture well the dynamics of reduced-order systems. Furthermore, the present study quantifies for the first time the Lyapunov spectrum of the KS equation with BPTT, achieving similar accuracy to RC. This study establishes that RNNs are a potent computational framework for the learning and forecasting of complex spatiotemporal systems.
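To make the RC training procedure referenced above concrete, the sketch below implements a minimal echo state network in NumPy: the reservoir weights are drawn at random and held fixed, and only a linear readout is fit by ridge regression. The benchmark here is a Lorenz-63 trajectory integrated with explicit Euler as a lightweight stand-in for the Lorenz-96 and KS benchmarks of the paper; all hyperparameters (reservoir size, spectral radius, input scaling, regularization, washout length) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

dim_in, dim_res = 3, 300                         # input dimension, reservoir size
W_in = rng.uniform(-0.02, 0.02, (dim_res, dim_in))
W = rng.uniform(-1.0, 1.0, (dim_res, dim_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

def run_reservoir(inputs):
    """Drive the fixed random reservoir with an input sequence; collect states."""
    h = np.zeros(dim_res)
    states = []
    for u in inputs:
        h = np.tanh(W @ h + W_in @ u)
        states.append(h.copy())
    return np.array(states)

def lorenz_step(x, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    """One explicit-Euler step of the Lorenz-63 system (toy chaotic benchmark)."""
    dx = np.array([s * (x[1] - x[0]),
                   x[0] * (r - x[2]) - x[1],
                   x[0] * x[1] - b * x[2]])
    return x + dt * dx

traj = [np.array([1.0, 1.0, 1.0])]
for _ in range(2000):
    traj.append(lorenz_step(traj[-1]))
traj = np.array(traj)

X, Y = traj[:-1], traj[1:]                       # one-step-ahead targets
washout = 100                                    # discard initial transient states
H, Yt = run_reservoir(X)[washout:], Y[washout:]

# Ridge-regression readout: the only trained component of the model.
beta = 1e-6
W_out = Yt.T @ H @ np.linalg.inv(H.T @ H + beta * np.eye(dim_res))

pred = H @ W_out.T                               # one-step predictions
mse = np.mean((pred - Yt) ** 2)
print(mse)
```

Because training reduces to a single linear solve rather than gradient descent through time, this readout fit is what gives RC its large training-time advantage over BPTT noted in the abstract; iterated (closed-loop) forecasting would feed each prediction back as the next input.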
