Sales Demand Forecast in E-commerce using a Long Short-Term Memory Neural Network Methodology

Generating accurate and reliable sales forecasts is crucial in the E-commerce business. Current state-of-the-art techniques are typically univariate methods, which produce forecasts from the historical sales data of a single product only. However, when large quantities of related time series are available, conditioning the forecast of an individual time series on the past behaviour of similar, related time series can be beneficial. Because the product assortment hierarchy of an E-commerce platform contains large numbers of related products whose sales demand patterns can be correlated, we attempt to incorporate this cross-series information in a unified model. We achieve this by globally training a Long Short-Term Memory (LSTM) network that exploits the non-linear demand relationships available in an E-commerce product assortment hierarchy. Alongside the forecasting framework, we propose a systematic pre-processing framework to overcome the challenges in the E-commerce business. We also introduce several product grouping strategies to supplement the LSTM learning schemes in situations where sales patterns in a product portfolio are disparate. We empirically evaluate the proposed forecasting framework on a real-world online marketplace dataset from Walmart.com. Our method achieves competitive results on category-level and super-departmental-level datasets, outperforming state-of-the-art techniques.
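The idea of global training can be illustrated with a minimal sketch, which is not the implementation evaluated here. It pools moving-window training pairs from every related product into one training set, so that a single network (for example an LSTM) could learn from all series at once. The function names, the `log1p` transform, and the per-window mean normalisation are assumptions made for illustration only.

```python
import numpy as np


def make_windows(series, input_size, horizon):
    """Slice one series into (input, target) pairs with a moving window."""
    inputs, targets = [], []
    for start in range(len(series) - input_size - horizon + 1):
        window = series[start:start + input_size]
        target = series[start + input_size:start + input_size + horizon]
        # Assumed normalisation: centre each pair on its input-window mean,
        # so products with different sales volumes share a common scale.
        level = window.mean()
        inputs.append(window - level)
        targets.append(target - level)
    return np.array(inputs), np.array(targets)


def pool_windows(series_by_product, input_size, horizon):
    """Build one global training set from all related series."""
    xs, ys = [], []
    for sales in series_by_product.values():
        sales = np.asarray(sales, dtype=float)
        if len(sales) < input_size + horizon:
            continue  # too short to yield a single window
        # Assumed variance-stabilising transform; log1p tolerates zero sales.
        x, y = make_windows(np.log1p(sales), input_size, horizon)
        xs.append(x)
        ys.append(y)
    return np.concatenate(xs), np.concatenate(ys)
```

The pooled arrays returned by `pool_windows` would then be fed to one shared model, in contrast to a univariate approach that fits a separate model per product.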
