MOrdReD: Memory-based Ordinal Regression Deep Neural Networks for Time Series Forecasting

Time series forecasting is ubiquitous in the modern world. Applications range from health care to astronomy, and include climate modelling, financial trading, and the monitoring of critical engineering equipment. To offer value across this range of activities, we must have models that not only provide accurate forecasts but also quantify and adjust their uncertainty over time. Furthermore, such models must accommodate the multimodal, non-Gaussian behaviour that arises regularly in applied settings. In this work, we propose a novel, end-to-end deep learning method for time series forecasting. Crucially, our model allows principled assessment of predictive uncertainty and provides rich information about multiple modes of future data values. Our approach not only produces excellent predictive forecasts, shadowing true future values, but also lets us infer valuable information, such as the predictive distribution of the occurrence of critical events of interest, accurately and reliably even over long time horizons. We find that the method outperforms other state-of-the-art algorithms, such as Gaussian Processes.
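The key recasting behind ordinal-regression forecasting is to discretise the real-valued signal into ordered bins and predict a full categorical distribution over the next bin, which can naturally be multimodal and non-Gaussian. The paper's model is an LSTM-based deep network; as an illustration only, the sketch below substitutes a first-order transition table for the recurrent "memory". All function names are hypothetical, not from the paper.

```python
def to_ordinal_bins(series, n_bins):
    """Discretise a real-valued series into n_bins equal-width ordinal classes."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series
    return [min(int((x - lo) / width), n_bins - 1) for x in series]

def fit_transition_counts(bins, n_bins):
    """Count bin-to-bin transitions (Laplace-smoothed).
    This stands in for the learned recurrent model in the paper."""
    counts = [[1.0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1.0
    return counts

def predictive_distribution(counts, current_bin):
    """Normalise the transition counts into a categorical predictive
    distribution over the next value's bin."""
    row = counts[current_bin]
    total = sum(row)
    return [c / total for c in row]

# Toy usage: a rise-and-fall series yields a bimodal one-step-ahead
# distribution from bin 2 (it has historically moved both up and down).
series = [0.0, 0.5, 1.0, 0.5, 0.0]
bins = to_ordinal_bins(series, 4)          # [0, 2, 3, 2, 0]
counts = fit_transition_counts(bins, 4)
p = predictive_distribution(counts, 2)
```

Multi-step forecasts then follow by sampling a next bin from `p` and feeding it back in, so uncertainty compounds over the horizon instead of collapsing to a single point prediction.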
