A Multi-Agent Framework for Recommendation with Heterogeneous Sources

As web technologies continue to advance, there is a common need to make recommendations from heterogeneous sources, such as recommending products and advertisements together on e-commerce websites. Such recommendation problems are usually solved with a two-stage paradigm: the first stage generates candidates from each source, and the second stage aggregates and ranks the heterogeneous candidates to produce the final results. While existing models have achieved many successes, they mostly optimize the two stages separately. User preferences can then only supervise the second stage, while the first stage receives no signal indicating whether the generated candidates are accurate enough to cover the user's preferences. To solve this problem, we design a multi-agent framework that jointly optimizes the two stages. Specifically, given N sources, we deploy N+1 agents: the first N agents correspond one-to-one with the sources and select source-specific candidates, and the last agent aggregates the candidates from the different sources for the final recommendation. All agents play a cooperative game, aiming to maximize rewards that reflect user preferences. We implement our idea based on the Deep Q-network, for which we design a decomposable reward to improve training efficiency. We adapt our model to a real-world recommendation problem abstracted from a well-known short-video platform, Kuaishou.com, and conduct extensive experiments to demonstrate the effectiveness of our model.
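The N+1 agent layout described above can be illustrated with a minimal sketch. It is not the paper's implementation: a tabular value per item stands in for each agent's Deep Q-network, the update is a one-step bandit-style rule, and the "decomposable" reward is assumed to mean that each shown item's click reward is credited to the aggregator and to the one source agent that proposed it. All class names, function names, and parameters (`SourceAgent`, `AggregatorAgent`, `train_step`, `k`, `top_n`, `epsilon`, `lr`) are hypothetical.

```python
import random


class SourceAgent:
    """One agent per source: proposes k candidates from its own item pool."""

    def __init__(self, items, k, epsilon=0.3):
        self.items = list(items)
        self.k = k
        self.epsilon = epsilon
        # Tabular values: an assumed stand-in for a per-agent Q-network.
        self.q = {item: 0.0 for item in self.items}

    def select(self, rng):
        # Epsilon-greedy: explore with a random sample, else take the top-k values.
        if rng.random() < self.epsilon:
            return rng.sample(self.items, self.k)
        return sorted(self.items, key=lambda i: self.q[i], reverse=True)[: self.k]


class AggregatorAgent:
    """The (N+1)-th agent: ranks the pooled candidates from all sources."""

    def __init__(self, epsilon=0.3):
        self.epsilon = epsilon
        self.q = {}

    def rank(self, candidates, top_n, rng):
        if rng.random() < self.epsilon:
            return rng.sample(candidates, top_n)
        return sorted(candidates, key=lambda i: self.q.get(i, 0.0), reverse=True)[:top_n]


def train_step(source_agents, aggregator, clicked, rng, top_n=3, lr=0.5):
    """Run both stages once and update every agent from the same user feedback."""
    # Stage 1: each source agent proposes candidates; remember who proposed what.
    proposer = {}
    for agent in source_agents:
        for item in agent.select(rng):
            proposer[item] = agent
    # Stage 2: the aggregator ranks the pooled candidates.
    shown = aggregator.rank(list(proposer), top_n, rng)
    # Assumed reward decomposition: one click reward per shown item, credited
    # to the aggregator and to the source agent that proposed that item.
    for item in shown:
        reward = 1.0 if item in clicked else 0.0
        for table in (aggregator.q, proposer[item].q):
            old = table.get(item, 0.0)
            table[item] = old + lr * (reward - old)
    return shown


if __name__ == "__main__":
    rng = random.Random(0)
    agents = [
        SourceAgent(["p1", "p2", "p3", "p4"], k=2),  # e.g. a product source
        SourceAgent(["a1", "a2", "a3"], k=2),        # e.g. an advertisement source
    ]
    aggregator = AggregatorAgent()
    for _ in range(200):
        train_step(agents, aggregator, clicked={"p3", "a2"}, rng=rng)
    print(agents[0].q, agents[1].q, aggregator.q)
```

The point of the sketch is the flow of supervision: the click signal observed after the second stage also updates the first-stage agents, which is the joint optimization the abstract argues for. A faithful version would replace the tables with Q-networks conditioned on user state and use temporal-difference targets.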
