Retail sales forecasting with meta-learning

Abstract: Retail sales forecasting often requires forecasts for thousands of products across many stores. We present a meta-learning framework based on newly developed deep convolutional neural networks. The framework first learns a feature representation automatically from raw sales time series, and then maps the learnt features to a set of weights that are used to combine a pool of base forecasting methods. Experiments on IRI weekly data show that the proposed meta-learner outperforms a number of state-of-the-art benchmarks, although the accuracy gains over some of the more sophisticated meta-ensemble benchmarks are modest and the learnt features lack interpretability. When designing a meta-learner for retail sales forecasting, we recommend building a pool of base forecasters that includes both individual and pooled forecasting methods, and targeting the best forecast combination rather than the best individual method.
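To make the two-stage idea in the abstract concrete, the following is a minimal sketch of the forward pass only: a small convolutional feature extractor reads a raw sales series, and a dense layer turns the pooled features into weights that combine a pool of base forecasts. It is not the architecture used in the paper. The function names, the layer sizes, the softmax normalisation of the weights, the three toy base forecasters, and the random (untrained) parameters are all assumptions made for illustration.

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1-D convolution: series x (T,) with kernels (K, width) -> (K, T - width + 1)."""
    width = kernels.shape[1]
    windows = np.stack([x[i:i + width] for i in range(len(x) - width + 1)])
    return kernels @ windows.T

def softmax(z):
    """Numerically stable softmax, so the weights are non-negative and sum to one."""
    e = np.exp(z - z.max())
    return e / e.sum()

def combination_weights(series, kernels, dense):
    """Map a raw series to one weight per base forecaster (hypothetical layer layout)."""
    feats = np.maximum(conv1d(series, kernels), 0.0)  # ReLU activation
    pooled = feats.max(axis=1)                        # global max pooling -> (K,)
    return softmax(dense @ pooled)                    # (M,) weights

def combine(base_forecasts, weights):
    """Weighted combination of M base forecasts over H horizons: (M,) @ (M, H) -> (H,)."""
    return weights @ base_forecasts

# Toy data: one weekly sales series and three simple base forecasters (assumed pool).
rng = np.random.default_rng(0)
series = rng.poisson(20, size=52).astype(float)
horizon = 4
base_forecasts = np.vstack([
    np.full(horizon, series[-1]),          # naive: last observed value
    np.full(horizon, series.mean()),       # historical mean
    np.full(horizon, series[-4:].mean()),  # four-week moving average
])

# Untrained, randomly initialised parameters; in practice these would be learnt
# by minimising the error of the combined forecast across many series.
kernels = rng.normal(size=(4, 5))
dense = rng.normal(size=(3, 4))

weights = combination_weights(series, kernels, dense)
forecast = combine(base_forecasts, weights)
print("weights:", np.round(weights, 3))
print("combined forecast:", np.round(forecast, 2))
```

Because the weights form a convex combination, the combined forecast at each horizon always lies between the smallest and largest base forecast. This reflects the recommendation to target a good combination rather than to select a single best method.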
