Correcting Forecasts with Multifactor Neural Attention

Automatic forecasting of time series data is a challenging problem in many industries. The forecast models that businesses currently use offer no adequate way to include data on external factors that can significantly affect a time series, such as weather, national and local events, social media trends, and promotions. This paper introduces a novel neural network attention mechanism that naturally incorporates data from multiple external sources without the feature engineering that other techniques require. We demonstrate empirically that the proposed model outperforms baseline models, including neural networks, linear models, certain kernel methods, Bayesian regression, and decision trees, at predicting the demand for 20 commodities across 107 stores of one of America's largest retailers. Incorporating external data sources accounts for a 23.9% relative improvement, and the method provides an unprecedented level of descriptive ability for a neural network forecasting model.
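
To make the idea of attending over several external sources concrete, the following is a minimal sketch of generic soft attention, not the architecture proposed in the paper. It assumes each external factor (weather, events, promotions) has already been encoded as a fixed-length vector, and that a query vector summarizes the series being forecast. The function name `attend_over_factors`, the bilinear scoring matrix `W`, and the random inputs are all hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend_over_factors(query, factors, W):
    """Weight external-factor vectors by their relevance to a query.

    query:   (d,) vector summarizing the series (assumed given)
    factors: dict mapping a factor name to a (d,) vector (assumed given)
    W:       (d, d) bilinear scoring matrix (hypothetical; learned in practice)
    """
    names = list(factors)
    F = np.stack([factors[n] for n in names])   # shape (k, d)
    scores = F @ (W @ query)                    # one relevance score per factor
    weights = softmax(scores)                   # non-negative, sums to 1
    context = weights @ F                       # weighted mix of factors, shape (d,)
    return context, dict(zip(names, weights))

# Illustrative random inputs; real inputs would come from trained encoders.
rng = np.random.default_rng(0)
d = 4
query = rng.normal(size=d)
factors = {name: rng.normal(size=d) for name in ("weather", "events", "promotions")}
W = rng.normal(size=(d, d))

context, weights = attend_over_factors(query, factors, W)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

Because the weights are non-negative and sum to one, they can be read as the share of attention given to each source for a particular forecast, which is one way such a model could offer descriptive output.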
