Modeling Sequential Recommendation as Missing Information Imputation

Side information is used extensively to improve the effectiveness of sequential recommendation models, as it helps capture the transition patterns among items. Most previous work on sequential recommendation that uses side information models item IDs and side information separately, and may therefore fail to fully capture the relations between items and their side information. Moreover, in real-world systems, not all values of item feature fields are available, which hurts the performance of models that rely on side information. Existing methods tend to neglect the context of missing item feature fields and fill them with generic or special values, e.g., "unknown", which can lead to sub-optimal performance. To address these limitations, we propose a unified task, missing information imputation (MII), that both fuses side information and alleviates the problem of missing side information: it randomly masks some feature fields, including item IDs, in a given sequence of items and trains a predictive model to recover them. By treating the next item as a missing feature field, sequential recommendation becomes a special case of MII. We propose a sequential recommendation model, the missing information imputation recommender (MIIR), that builds on the idea of MII and simultaneously imputes missing item feature values and predicts the next item. We devise a dense fusion self-attention (DFSA) mechanism for MIIR to capture all pairwise relations between items and their side information. Empirical studies on three benchmark datasets demonstrate that MIIR, supervised by MII, achieves significantly better sequential recommendation performance than state-of-the-art baselines.
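To make the MII formulation concrete, the sketch below illustrates one possible masking procedure. It is a minimal illustration under our own assumptions: the dictionary-based item representation, the field names, the `[MASK]` placeholder, the masking ratio, and the `mask_missing_information` helper are all hypothetical and are not the paper's actual implementation. It shows how feature fields, including item IDs, can be randomly masked, and how the next-item ID can be treated as just another missing field to be recovered.

```python
import random

# Hypothetical placeholder for a masked (missing) feature value.
MASK = "[MASK]"

def mask_missing_information(sequence, mask_ratio=0.2, seed=None):
    """Randomly mask feature fields (including item IDs) in an item sequence.

    The item ID of the final position is always masked, so that next-item
    prediction becomes a special case of missing information imputation.
    This is an illustrative sketch, not the paper's exact procedure.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for pos, item in enumerate(sequence):
        new_item = dict(item)
        for field, value in item.items():
            is_next_item = pos == len(sequence) - 1 and field == "item_id"
            if is_next_item or rng.random() < mask_ratio:
                new_item[field] = MASK
                targets.append((pos, field, value))  # supervision signal
        masked.append(new_item)
    return masked, targets

# Toy usage: each item carries an ID plus side-information fields.
seq = [
    {"item_id": 17, "category": "shoes", "brand": "acme"},
    {"item_id": 42, "category": "shirts", "brand": "zeta"},
    {"item_id": 99, "category": "shoes", "brand": "acme"},  # next item to predict
]
masked_seq, targets = mask_missing_information(seq, mask_ratio=0.3, seed=0)
print(masked_seq)
print(targets)
```

In the proposed model, the masked sequence would then be encoded by the dense fusion self-attention (DFSA) mechanism, which attends over every item-ID and side-information field so that all pairwise relations between items and their side information can inform the recovery of each masked value.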
