Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning

Since the inception of Recommender Systems (RS), the accuracy of recommendations in terms of relevance has been the gold standard for evaluating the quality of RS algorithms. However, by focusing on item relevance alone, one pays a significant price in other important metrics: users get stuck in a "filter bubble" and their array of options is significantly reduced, degrading the quality of the user experience and leading to churn. Recommendation, and in particular session-based/sequential recommendation, is a complex task with multiple and often conflicting objectives that existing state-of-the-art approaches fail to address. In this work, we take on this challenge and introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the RS setting, a novel Reinforcement Learning (RL) framework that can effectively address multi-objective recommendation tasks. The proposed SMORL agent augments standard recommendation models with additional RL layers that encourage it to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations. We integrate this framework with four state-of-the-art session-based recommendation models and compare it with a single-objective RL agent that focuses only on accuracy. Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, and reduced repetitiveness of recommendations, and they demonstrate the importance of reinforcing diversity and novelty as complementary objectives.
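The scalarization idea at the heart of SMORL can be illustrated with a minimal sketch: the multi-objective reward vector (accuracy, diversity, novelty) is collapsed into a single scalar via a weighted sum that the RL agent then maximizes. The function name, weights, and reward signals below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of reward scalarization for a multi-objective RL agent.
# The weights trade off accuracy against diversity and novelty of recommendations.

def scalarized_reward(r_accuracy: float, r_diversity: float, r_novelty: float,
                      w: tuple = (1.0, 0.5, 0.5)) -> float:
    """Collapse a three-component reward vector into a single scalar."""
    w_acc, w_div, w_nov = w
    return w_acc * r_accuracy + w_div * r_diversity + w_nov * r_novelty

# Example: an accuracy-only signal vs. a recommendation that also scores
# on diversity and novelty.
print(scalarized_reward(1.0, 0.0, 0.0))  # -> 1.0
print(scalarized_reward(0.8, 0.6, 0.4))  # -> 0.8 + 0.3 + 0.2 = 1.3
```

Tuning the weight vector shifts the agent along the trade-off surface between relevance and exploration-oriented objectives; a single-objective baseline corresponds to weights (1.0, 0.0, 0.0).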
