Bandits Under The Influence (Extended Version)

Recommender systems should adapt as user interests evolve, and a prevalent driver of that evolution is the influence of a user's social circle. When interests are unknown, online algorithms that explore the recommendation space while exploiting observed preferences are preferable. We present online recommendation algorithms rooted in the linear multi-armed bandit literature, tailored to scenarios where user interests evolve under social influence. In particular, we show that our adaptations of the classic LinREL and Thompson Sampling algorithms maintain the same asymptotic regret bounds as in the non-social case. We validate our approach experimentally on both synthetic and real datasets.
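
The paper's social-influence adaptations are not reproduced here, but as a point of reference the following is a minimal sketch of standard linear Thompson Sampling, the non-social baseline the abstract says is adapted. It assumes fixed arm feature vectors, Gaussian reward noise, and a Gaussian posterior over the unknown preference vector; the names (linear_thompson_sampling, theta_star, v) and the simulated-reward loop are illustrative, not the paper's implementation.

```python
import numpy as np

def linear_thompson_sampling(arms, theta_star, horizon, noise_std=0.1, v=0.5):
    """Minimal sketch of linear Thompson Sampling (non-social baseline).

    arms:       (K, d) array of fixed arm feature vectors.
    theta_star: (d,) hidden preference vector, used only to simulate rewards.
    v:          posterior-inflation scale controlling exploration.
    """
    K, d = arms.shape
    B = np.eye(d)        # posterior precision matrix (prior: identity)
    f = np.zeros(d)      # running sum of reward-weighted features
    rewards = []
    for t in range(horizon):
        mu = np.linalg.solve(B, f)          # posterior mean of the preference vector
        cov = v ** 2 * np.linalg.inv(B)     # posterior covariance
        theta_t = np.random.multivariate_normal(mu, cov)  # sample a parameter
        a = int(np.argmax(arms @ theta_t))  # play the arm best under the sample
        x = arms[a]
        # Simulated linear reward with Gaussian noise (illustrative only).
        r = x @ theta_star + noise_std * np.random.randn()
        B += np.outer(x, x)                 # rank-one posterior update
        f += r * x
        rewards.append(r)
    return np.array(rewards)

if __name__ == "__main__":
    np.random.seed(0)
    arms = np.random.normal(size=(10, 5))
    theta_star = np.random.normal(size=5)
    rewards = linear_thompson_sampling(arms, theta_star, horizon=1000)
    print("average reward:", rewards.mean())
```

The rank-one update of the precision matrix keeps the per-step cost at O(d^2); a social-influence variant of the kind the abstract describes would presumably modify how the preference estimate evolves between rounds, rather than this core sample-then-act loop.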
