Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data

Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. Recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.
