Representation Learning for Predicting Customer Orders

The ability to predict future customer orders is of significant value to retailers in making many crucial operational decisions. Unlike next-basket prediction or temporal set prediction, which focus on predicting a subset of items for a single user, this paper targets the distributional information of future orders, i.e., the possible subsets of items and their frequencies (probabilities). This information is required for decisions such as assortment selection for front-end warehouses and capacity evaluation for fulfillment centers. Using key statistics of a real order dataset from Tmall supermarket, we illustrate the challenges of order prediction. Motivated by our analysis showing that even biased models of the order distribution can improve the quality of order prediction, we design a generative model that captures the order distribution for customer order prediction. Our model uses representation learning to embed items into a Euclidean space, and we design a highly efficient stochastic gradient descent (SGD) algorithm to learn the item embeddings. Future orders are predicted by calibrating orders obtained from random walks over the embedding graph. Experiments show that our model outperforms all existing methods. The benefit of our model is also illustrated with an application to assortment selection for front-end warehouses.
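
To make the prediction target concrete, the following minimal sketch computes an empirical order distribution, i.e., each distinct subset of items together with its relative frequency. It illustrates only the quantity being predicted, not the generative model proposed here; the function name `order_distribution` and the toy item names are assumptions made for illustration.

```python
from collections import Counter


def order_distribution(orders):
    """Map each distinct item subset to its empirical probability.

    Each order is treated as an unordered set of items, so
    ["milk", "bread"] and ["bread", "milk"] count as the same subset.
    """
    counts = Counter(frozenset(order) for order in orders)
    total = sum(counts.values())
    return {subset: n / total for subset, n in counts.items()}


# Hypothetical toy orders; the item names are illustrative only.
orders = [
    ["milk", "bread"],
    ["bread", "milk"],
    ["eggs"],
    ["milk", "bread", "eggs"],
]

dist = order_distribution(orders)
for subset, prob in dist.items():
    print(sorted(subset), prob)
```

With a realistic catalog, the number of possible subsets grows exponentially in the number of items, so most subsets are observed rarely or never. Raw counting of this kind is therefore unlikely to suffice on its own.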
