Long-term Forecasting using Higher Order Tensor RNNs

We present Higher-Order Tensor RNN (HOT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging due to long-term temporal dependencies, higher-order correlations, and sensitivity to error propagation. Our proposed recurrent architecture addresses these issues by learning the nonlinear dynamics directly, using higher-order moments and higher-order state transition functions. Furthermore, we decompose the higher-order structure with the tensor-train decomposition, reducing the number of parameters while preserving model performance. We theoretically establish approximation guarantees and a variance bound for HOT-RNN on general sequence inputs. We also demonstrate 5% to 12% improvements in long-term prediction over standard RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world time-series data.
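
To make the architecture concrete, below is a minimal sketch of such a recurrent cell in NumPy. It is an illustration based only on the description above, not the paper's implementation: the exact update rule, the lag window s = [1; h_{t-1}; ...; h_{t-L}], the polynomial order P, and all names (HOTRNNCell, tt_cores, W_x) are assumptions. The point it demonstrates is that storing the order-(P+1) transition tensor in tensor-train (TT) format lets the higher-order pre-activation be computed as a chain of small matrix products, so the parameter count grows linearly in P rather than exponentially.

```python
import numpy as np

def tt_cores(dims, ranks, rng):
    """Random TT cores for a tensor of shape `dims`.
    ranks = [1, r_1, ..., r_P, 1], with len(ranks) == len(dims) + 1."""
    return [rng.standard_normal((ranks[k], dims[k], ranks[k + 1])) * 0.1
            for k in range(len(dims))]

class HOTRNNCell:
    """Hypothetical minimal higher-order tensor RNN cell.

    Assumed state update (reconstructed from the abstract's description):
        s   = [1; h_{t-1}; ...; h_{t-L}]          # lagged states + bias entry
        a_j = sum_{i1..iP} W[j, i1, ..., iP] * s_{i1} * ... * s_{iP}
        h_t = tanh(W_x x_t + a)
    where the transition tensor W is stored in TT format.
    """

    def __init__(self, input_dim, hidden_dim, lags=2, order=2, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_dim, self.lags, self.order = hidden_dim, lags, order
        s_dim = lags * hidden_dim + 1          # concatenated lags + bias entry
        # First TT core carries the output index; the remaining `order`
        # cores each carry one copy of the lagged-state vector s.
        dims = [hidden_dim] + [s_dim] * order
        ranks = [1] + [rank] * order + [1]
        self.cores = tt_cores(dims, ranks, rng)
        self.W_x = rng.standard_normal((hidden_dim, input_dim)) * 0.1

    def step(self, x_t, h_hist):
        """One update; h_hist holds the last `lags` hidden states."""
        s = np.concatenate([[1.0], *h_hist])               # shape (s_dim,)
        # Contract each input core with s: (r_k, s_dim, r_{k+1}) -> (r_k, r_{k+1})
        mats = [np.einsum('abc,b->ac', G, s) for G in self.cores[1:]]
        acc = self.cores[0][0]                             # (hidden_dim, r_1)
        for M in mats:
            acc = acc @ M                                  # chain of small matmuls
        a = acc[:, 0]                                      # final TT rank is 1
        return np.tanh(self.W_x @ x_t + a)

# Usage: roll the cell over a toy sequence.
cell = HOTRNNCell(input_dim=3, hidden_dim=8, lags=2, order=2, rank=4)
h_hist = [np.zeros(8), np.zeros(8)]
for t in range(10):
    x_t = np.sin(0.1 * t) * np.ones(3)                    # toy input signal
    h = cell.step(x_t, h_hist)
    h_hist = [h] + h_hist[:-1]                            # shift the lag window
print(h.shape)  # (8,)
```

Under these assumptions, with hidden size H, lag window L, order P, and TT rank r, the transition holds roughly O(Hr + P r^2 (LH + 1)) parameters, versus O(H (LH + 1)^P) for the dense higher-order tensor.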
