Click-Through Rate Prediction with Multi-Modal Hypergraphs

Advertising is critical to many online e-commerce platforms such as eBay and Amazon, and click-through rate (CTR) prediction is one of the key signals these platforms rely on. The recent popularity of multi-modal sharing platforms such as TikTok has increased interest in online micro-videos. Modelling micro-videos is therefore useful both for helping merchants target micro-video advertising and for surfacing the content users prefer, which improves user experience. Existing work on CTR prediction largely exploits unimodal content to learn item representations, and relatively little effort has been made to leverage multi-modal information exchange among users and items. We propose a model that uses temporal user-item interactions to guide representation learning with multi-modal features and then predicts the click-through rate of a user on a micro-video item. Specifically, we design a Hypergraph Click-Through Rate prediction framework (HyperCTR), built upon the hyperedge notion of hypergraph neural networks, which yields modal-specific representations of users and micro-videos to better capture user preferences. We construct a time-aware user-item bipartite network with multi-modal information and enrich the representation of each user and item with generated interest-based user and item hypergraphs. Extensive experiments on three public datasets demonstrate that the proposed model significantly outperforms various state-of-the-art methods.
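
To make the hyperedge notion concrete, the following is a minimal sketch and not the HyperCTR implementation. It assumes that an item hyperedge groups the micro-videos a user clicked within one time window, and it replaces the learned hypergraph convolution with a plain mean aggregation (item to hyperedge, then hyperedge back to item). The function names `build_hyperedges` and `propagate`, the `window` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_hyperedges(interactions, window):
    """Group clicked items into hyperedges keyed by (user, time slot).

    Assumption: one hyperedge per user per time window of length `window`.
    """
    edges = defaultdict(set)
    for user, item, timestamp in interactions:
        edges[(user, timestamp // window)].add(item)
    return dict(edges)


def propagate(features, hyperedges):
    """One item -> hyperedge -> item step using unweighted means."""
    dim = len(next(iter(features.values())))

    # Hyperedge embedding: mean of the features of its member items.
    edge_emb = {
        key: [sum(features[i][d] for i in items) / len(items) for d in range(dim)]
        for key, items in hyperedges.items()
    }

    # Item update: mean of the embeddings of the hyperedges containing it.
    updated = {}
    for item, feat in features.items():
        incident = [edge_emb[k] for k, items in hyperedges.items() if item in items]
        if not incident:
            updated[item] = list(feat)  # isolated item keeps its features
            continue
        updated[item] = [sum(e[d] for e in incident) / len(incident) for d in range(dim)]
    return updated


# Toy (user, item, timestamp) click log and 2-d item features (hypothetical).
interactions = [("u1", "v1", 0), ("u1", "v2", 5), ("u2", "v2", 3), ("u2", "v3", 12)]
features = {"v1": [1.0, 0.0], "v2": [0.0, 1.0], "v3": [1.0, 1.0]}

hyperedges = build_hyperedges(interactions, window=10)
updated = propagate(features, hyperedges)
```

In this toy log, `v1` and `v2` share the hyperedge `("u1", 0)`, so after one step each absorbs part of the other's features, while `v3` sits alone in its hyperedge and is unchanged. A multi-modal variant would run the same step separately on each modality's feature table.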
