Period-aware content attention RNNs for time series forecasting with missing values

Abstract Recurrent neural networks (RNNs) have recently received considerable attention for sequence modeling and time series analysis. Many time series exhibit periodic behavior, e.g. seasonal changes in weather or daily cycles in electricity usage. Here, we first analyze the behavior of attention-based RNNs with respect to periods in time series and illustrate that they fail to model them. We then propose an extended attention model for sequence-to-sequence RNNs designed to capture periods in time series with or without missing values. This extended attention model can be deployed on top of any RNN and is shown to yield state-of-the-art forecasting performance on several univariate and multivariate time series.
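The abstract does not spell out the mechanism, but the general idea of augmenting content attention with period information can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual model: it assumes a dot-product content score over encoder states and adds a hypothetical periodic bonus (a cosine of the time lag, with assumed parameters `period` and `alpha`) so that states one period away from the decoding step receive higher attention weight.

```python
import numpy as np

def period_aware_attention(query, keys, lags, period, alpha=1.0):
    """Illustrative sketch: softmax attention over encoder states with an
    additive periodic bonus on each state's time lag.

    query:  (d,)   decoder hidden state
    keys:   (T, d) encoder hidden states
    lags:   (T,)   time lag (in steps) of each encoder state from the
                   current decoding step
    period: assumed known period of the series (in steps)
    alpha:  assumed weight of the periodic term
    """
    content = keys @ query                                 # standard content scores
    periodic = alpha * np.cos(2 * np.pi * lags / period)   # peaks at lag = k * period
    scores = content + periodic
    e = np.exp(scores - scores.max())                      # numerically stable softmax
    return e / e.sum()
```

With zero content scores, the weights are driven purely by the periodic term: states whose lag is a multiple of the period dominate, which is the qualitative behavior a period-aware attention mechanism is meant to produce.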
