Sequential Recommendation with User Causal Behavior Discovery

The key to sequential recommendation lies in accurately modeling item correlations. Previous models infer these correlations from item co-occurrences, which may fail to capture the real causal relations and can hurt both recommendation performance and explainability. In this paper, we equip sequential recommendation with a novel causal discovery module to capture causalities among user behaviors. Our general idea is to first assume a causal graph underlying item correlations, and then learn this graph jointly with the sequential recommender model by fitting real user behavior data. More specifically, to satisfy the causality requirement, the causal graph is regularized by a differentiable directed acyclicity constraint. Because the number of items in a recommender system can be very large, we represent different items with a unified set of latent clusters and define the causal graph at the cluster level, which improves the model's scalability and robustness. In addition, we provide a theoretical analysis of the identifiability of the learned causal graph. To the best of our knowledge, this paper takes a first step towards combining sequential recommendation with causal discovery. To evaluate recommendation performance, we implement our framework with different neural sequential architectures and compare them with many state-of-the-art methods on real-world datasets. Empirical studies show that our model improves performance by about 6.1% on F1 and 11.3% on NDCG on average. To evaluate model explainability, we build a new dataset with human-labeled explanations for both quantitative and qualitative analysis.
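The abstract does not give the exact form of the differentiable acyclicity constraint. A common choice in continuous causal discovery is the NOTEARS-style penalty h(W) = tr(exp(W∘W)) − d, which is zero exactly when the weighted graph W is a DAG. The sketch below assumes that formulation and is not necessarily the authors' implementation. It uses a power series truncated at d terms, which is enough to detect every cycle in a d-node graph; the function names and the 3-cluster example matrices are hypothetical.

```python
from math import factorial


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(w):
    """Truncated NOTEARS-style penalty: sum_{k=1..d} tr((W∘W)^k) / k!.

    tr((W∘W)^k) accumulates the weight of all closed walks of length k,
    so the sum is 0 if and only if the graph has no directed cycle
    (any cycle in a d-node graph has length at most d).
    """
    d = len(w)
    # Element-wise square makes every entry non-negative.
    a = [[w[i][j] ** 2 for j in range(d)] for i in range(d)]
    power = [row[:] for row in a]  # holds (W∘W)^k, starting at k = 1
    h = 0.0
    for k in range(1, d + 1):
        h += sum(power[i][i] for i in range(d)) / factorial(k)
        if k < d:
            power = matmul(power, a)
    return h


# Hypothetical cluster-level graphs over 3 latent clusters.
dag = [[0.0, 0.8, 0.0],
       [0.0, 0.0, -0.5],
       [0.0, 0.0, 0.0]]      # 0 -> 1 -> 2, no cycle
cyclic = [[0.0, 0.8, 0.0],
          [0.0, 0.0, -0.5],
          [0.3, 0.0, 0.0]]   # adds 2 -> 0, closing a cycle

print(acyclicity_penalty(dag))     # zero for the acyclic graph
print(acyclicity_penalty(cyclic))  # strictly positive for the cyclic graph
```

In a joint training setup, a penalty of this kind would typically be added to the recommendation loss (for example through an augmented Lagrangian) so that gradient descent pushes the learned cluster-level graph towards acyclicity.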
