Long-term Forecasting using Higher Order Tensor RNNs

We present Higher-Order Tensor RNN (HOT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging due to long-term temporal dependencies, higher-order correlations, and sensitivity to error propagation. Our proposed recurrent architecture addresses these issues by learning the nonlinear dynamics directly, using higher-order moments and higher-order state transition functions. Furthermore, we decompose the higher-order structure with the tensor-train decomposition, reducing the number of parameters while preserving model performance. We theoretically establish approximation guarantees and a variance bound for HOT-RNN on general sequence inputs. We also demonstrate 5% to 12% improvements in long-term prediction over standard RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world time-series data.
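
To make the architecture concrete, below is a minimal sketch of such a recurrent cell in NumPy. It is an illustration based only on the description above, not the paper's implementation: the exact update rule, the lag window s = [1; h_{t-1}; ...; h_{t-L}], the polynomial order P, and all names (HOTRNNCell, tt_cores, W_x) are assumptions. The point it demonstrates is that storing the order-(P+1) transition tensor in tensor-train (TT) format lets the higher-order pre-activation be computed as a chain of small matrix products, so the parameter count grows linearly in P rather than exponentially.

```python
import numpy as np

def tt_cores(dims, ranks, rng):
    """Random TT cores for a tensor of shape `dims`.
    ranks = [1, r_1, ..., r_P, 1], with len(ranks) == len(dims) + 1."""
    return [rng.standard_normal((ranks[k], dims[k], ranks[k + 1])) * 0.1
            for k in range(len(dims))]

class HOTRNNCell:
    """Hypothetical minimal higher-order tensor RNN cell.

    Assumed state update (reconstructed from the abstract's description):
        s   = [1; h_{t-1}; ...; h_{t-L}]          # lagged states + bias entry
        a_j = sum_{i1..iP} W[j, i1, ..., iP] * s_{i1} * ... * s_{iP}
        h_t = tanh(W_x x_t + a)
    where the transition tensor W is stored in TT format.
    """

    def __init__(self, input_dim, hidden_dim, lags=2, order=2, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_dim, self.lags, self.order = hidden_dim, lags, order
        s_dim = lags * hidden_dim + 1          # concatenated lags + bias entry
        # First TT core carries the output index; the remaining `order`
        # cores each carry one copy of the lagged-state vector s.
        dims = [hidden_dim] + [s_dim] * order
        ranks = [1] + [rank] * order + [1]
        self.cores = tt_cores(dims, ranks, rng)
        self.W_x = rng.standard_normal((hidden_dim, input_dim)) * 0.1

    def step(self, x_t, h_hist):
        """One update; h_hist holds the last `lags` hidden states."""
        s = np.concatenate([[1.0], *h_hist])               # shape (s_dim,)
        # Contract each input core with s: (r_k, s_dim, r_{k+1}) -> (r_k, r_{k+1})
        mats = [np.einsum('abc,b->ac', G, s) for G in self.cores[1:]]
        acc = self.cores[0][0]                             # (hidden_dim, r_1)
        for M in mats:
            acc = acc @ M                                  # chain of small matmuls
        a = acc[:, 0]                                      # final TT rank is 1
        return np.tanh(self.W_x @ x_t + a)

# Usage: roll the cell over a toy sequence.
cell = HOTRNNCell(input_dim=3, hidden_dim=8, lags=2, order=2, rank=4)
h_hist = [np.zeros(8), np.zeros(8)]
for t in range(10):
    x_t = np.sin(0.1 * t) * np.ones(3)                    # toy input signal
    h = cell.step(x_t, h_hist)
    h_hist = [h] + h_hist[:-1]                            # shift the lag window
print(h.shape)  # (8,)
```

Under these assumptions, with hidden size H, lag window L, order P, and TT rank r, the transition holds roughly O(Hr + P r^2 (LH + 1)) parameters, versus O(H (LH + 1)^P) for the dense higher-order tensor.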
