Learning and Fusing Multiple User Interest Representations for Micro-Video and Movie Recommendations

Deep learning is known to be effective at automating the generation of representations, which eliminates the need for handcrafted features. For the task of personalized recommendation, deep learning-based methods have achieved great success by learning efficient representations of multimedia items, especially images and videos. Previous works usually adopt simple, single-modality representations of user interest, such as user embeddings, which cannot fully characterize the diversity and volatility of user interest. To address this problem, we focus on learning and fusing multiple kinds of user interest representations by leveraging deep networks. Specifically, we consider efficient representations of four aspects of user interest. First, we use a latent representation, i.e., the user embedding, to profile the overall interest. Second, we propose an item-level representation, which is learned from and integrates the features of a user's historical items. Third, we investigate a neighbor-assisted representation, i.e., using neighboring users' information to characterize user interest collaboratively. Fourth, we propose a category-level representation, which is learned from the categorical attributes of a user's historical items. To integrate these multiple user interest representations, we study both early fusion and late fusion, and for early fusion we compare different fusion functions. We validate the proposed method on two real-world video recommendation datasets, one for micro-video recommendation and one for movie recommendation. Experimental results demonstrate that our method outperforms existing state-of-the-art methods by a significant margin. Our code is publicly available.
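The difference between early and late fusion of the four representations can be illustrated with a minimal sketch. This is not the authors' implementation: the paper learns the representations and fusion with deep networks, whereas the sketch below assumes mean pooling to build the item-level, neighbor-assisted, and category-level vectors, and a dot product as the scoring function. All function names, fusion functions, and numeric values are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors (assumed stand-in for a learned aggregator)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as an assumed scoring function."""
    return sum(x * y for x, y in zip(a, b))


def user_representations(
    user_emb: Vector,
    history_item_feats: List[Vector],
    neighbor_embs: List[Vector],
    history_category_embs: List[Vector],
) -> Dict[str, Vector]:
    """Build the four kinds of user interest representation named in the abstract."""
    return {
        "latent": user_emb,                                  # user embedding
        "item_level": mean_pool(history_item_feats),         # from historical item features
        "neighbor": mean_pool(neighbor_embs),                # from neighboring users
        "category_level": mean_pool(history_category_embs),  # from item categories
    }


def fuse_sum(reps: List[Vector]) -> Vector:
    """Element-wise sum: one possible early-fusion function."""
    return [sum(col) for col in zip(*reps)]


def fuse_max(reps: List[Vector]) -> Vector:
    """Element-wise max: another possible early-fusion function."""
    return [max(col) for col in zip(*reps)]


def early_fusion_score(
    reps: Dict[str, Vector],
    item_vec: Vector,
    fuse: Callable[[List[Vector]], Vector] = fuse_sum,
) -> float:
    """Early fusion: merge the representations first, then score once."""
    return dot(fuse(list(reps.values())), item_vec)


def late_fusion_score(reps: Dict[str, Vector], item_vec: Vector) -> float:
    """Late fusion: score each representation separately, then average the scores."""
    scores = [dot(r, item_vec) for r in reps.values()]
    return sum(scores) / len(scores)


# Toy 2-dimensional inputs (hypothetical values).
reps = user_representations(
    user_emb=[1.0, 0.0],
    history_item_feats=[[1.0, 1.0], [3.0, 1.0]],
    neighbor_embs=[[0.0, 2.0], [0.0, 4.0]],
    history_category_embs=[[1.0, 0.0], [1.0, 2.0]],
)
item_vec = [1.0, 2.0]

print(early_fusion_score(reps, item_vec))            # sum fusion
print(early_fusion_score(reps, item_vec, fuse_max))  # max fusion
print(late_fusion_score(reps, item_vec))             # averaged per-representation scores
```

With a purely linear scorer such as the dot product, sum fusion and score averaging differ only by a constant factor, so the two strategies become meaningfully different only once the fusion function or the scorer is nonlinear, as with the element-wise max here or a learned network.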
