Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting

Access to a large variety of data across a massive population has made it possible to predict customer purchase patterns and responses to marketing campaigns. In particular, accurate demand forecasts for popular products with frequent repeat purchases are essential, since these products are among the main drivers of profits. However, buyer purchase patterns are extremely diverse and sparse at the per-product level, owing to population heterogeneity and to dependence in purchase patterns across product categories. Traditional methods in survival analysis have proven effective in dealing with censored data by assuming parametric distributions on inter-arrival times. The distributional parameters are then fitted, typically in a regression framework. Neural-network-based models, on the other hand, take a non-parametric approach and learn relations from a larger functional class. However, the lack of distributional assumptions makes it difficult to model partially observed data. In this paper, we directly model the inter-arrival times, together with the partially observed information at each time step, in a survival-based approach that uses Recurrent Neural Networks (RNNs) to capture purchase times jointly over several products. Instead of predicting a point estimate for inter-arrival times, the RNN outputs parameters that define a distributional estimate. The loss function is the negative log-likelihood of these parameters given partially observed data. This approach makes it possible to leverage both fully observed data and partial information. By externalizing the censoring problem through a log-likelihood loss function, we show that substantial improvements over state-of-the-art machine learning methods can be achieved. We present experimental results on two open datasets as well as a study on a real dataset from a large retailer.
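
To make the loss described above concrete, the following is a minimal sketch of a censored negative log-likelihood. It assumes a Weibull distribution on inter-arrival times; the abstract does not name the distribution, so this choice is an assumption. The function name `censored_weibull_nll` and its arguments are hypothetical. For a fully observed arrival the log-density is used, and for a censored one only the log-survival term contributes.

```python
import math


def censored_weibull_nll(times, observed, scales, shapes):
    """Mean negative log-likelihood under an ASSUMED Weibull model.

    times    -- elapsed time per sample (time to purchase, or time
                observed so far if no purchase was seen); must be > 0
    observed -- 1 if the purchase was observed, 0 if censored
    scales   -- Weibull scale parameters (e.g. one model output per sample)
    shapes   -- Weibull shape parameters (e.g. one model output per sample)
    """
    total = 0.0
    for t, u, lam, k in zip(times, observed, scales, shapes):
        z = t / lam
        # log hazard: log(k / lam) + (k - 1) * log(t / lam)
        log_hazard = math.log(k) - math.log(lam) + (k - 1.0) * math.log(z)
        # cumulative hazard: (t / lam) ** k, i.e. minus the log-survival
        cum_hazard = z ** k
        # observed: log f(t) = log_hazard - cum_hazard
        # censored: log S(t) = -cum_hazard
        total += u * log_hazard - cum_hazard
    return -total / len(times)


# One observed purchase and one censored sample, with hypothetical parameters.
loss = censored_weibull_nll(
    times=[2.0, 5.0], observed=[1, 0], scales=[2.0, 4.0], shapes=[1.0, 1.5]
)
```

In a multivariate setting, one such term could be computed per product and the results summed, with the scale and shape values taken from the network outputs at each time step; that wiring is not shown here.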
