Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value

In industry, customer lifetime value (LTV) prediction is a challenging task because user consumption data is usually volatile, noisy, or sparse. To address these issues, this paper presents a novel Temporal-Structural User Representation network (TSUR) to predict LTV. We use historical revenue time series to learn a temporal user representation and user attributes to learn a structural one. Specifically, the temporal representation is learned with a temporal trend encoder based on a novel multi-channel Discrete Wavelet Transform (DWT) module, while the structural representation is derived with a Graph Attention Network (GAT) on an attribute similarity graph. Furthermore, a novel cluster-alignment regularization method is employed to align and enhance these two kinds of representations. In essence, this fusion can be viewed as associating the temporal and structural representations in the low-pass representation space, which also helps prevent data noise from being transferred across views. To our knowledge, this is the first time that temporal and structural user representations have been jointly learned for LTV prediction. Extensive offline experiments on two large-scale real-world datasets, together with online A/B tests, show the superiority of our approach over a number of competitive baselines.
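To make the two input views concrete, the following is a minimal illustrative sketch, not the TSUR implementation. It assumes a single-level Haar wavelet for the DWT step and a cosine-similarity k-nearest-neighbor rule for the attribute graph; the abstract specifies neither choice, and the function names `haar_dwt` and `knn_similarity_graph` are hypothetical. The multi-channel DWT module, the GAT, and the cluster-alignment regularizer are not reproduced here.

```python
import numpy as np


def haar_dwt(series):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(series, dtype=float)
    if x.size % 2:  # pad odd-length input by repeating the last value
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass branch: smoothed trend
    detail = (even - odd) / np.sqrt(2.0)  # high-pass branch: local fluctuations
    return approx, detail


def knn_similarity_graph(attributes, k):
    """Symmetric boolean adjacency linking each user to its k most similar users.

    Similarity is cosine similarity between attribute vectors (an assumption).
    """
    a = np.asarray(attributes, dtype=float)
    norms = np.linalg.norm(a, axis=1, keepdims=True)
    unit = a / np.clip(norms, 1e-12, None)  # guard against all-zero rows
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a user is never its own neighbor
    neighbors = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros(sim.shape, dtype=bool)
    rows = np.repeat(np.arange(a.shape[0]), k)
    adj[rows, neighbors.ravel()] = True
    return adj | adj.T  # make the graph undirected


# Toy data (made up for illustration): one user's revenue series
# and four users' attribute vectors.
revenue = [4.0, 6.0, 10.0, 12.0, 0.0, 0.0, 8.0, 2.0]
approx, detail = haar_dwt(revenue)

attrs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
adj = knn_similarity_graph(attrs, k=1)
```

In this sketch the approximation coefficients play the role of the low-pass trend signal, and the adjacency matrix is the kind of attribute similarity graph on which a graph attention layer could operate.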
