A Fast Bandit Algorithm for Recommendation to Users With Heterogeneous Tastes

We study recommendation in scenarios where there is no prior information about the quality of content in the system. We present an online algorithm that continually optimizes recommendation relevance based on the behavior of past users. Our method trades weaker asymptotic performance guarantees than the state of the art for stronger theoretical guarantees in the online setting. We test our algorithm on real-world data collected from previous recommender systems, showing that it learns faster than existing methods and performs equally well in the long run.
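The abstract describes an online algorithm that learns content quality from past user behavior with no prior information. As a point of reference for this setting, the sketch below implements the classic UCB1 index policy (Auer et al.), which is a standard baseline for such explore-exploit problems; it is not the paper's algorithm, and the arm/reward setup is an illustrative assumption.

```python
import math
import random

def ucb1(n_arms, reward_fn, horizon):
    """Generic UCB1 bandit: play each arm once, then pick the arm
    maximizing empirical mean + sqrt(2 ln t / n_i)."""
    counts = [0] * n_arms      # times each arm was played
    sums = [0.0] * n_arms      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # initialization: try every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        r = reward_fn(arm)     # observe user feedback for this item
        counts[arm] += 1
        sums[arm] += r
    return counts, sums

# Illustrative simulation: three items with hypothetical click rates.
random.seed(0)
click_rates = [0.2, 0.8, 0.5]
counts, sums = ucb1(
    3, lambda a: 1.0 if random.random() < click_rates[a] else 0.0, 2000
)
```

After 2000 simulated rounds the index policy concentrates its plays on the highest-reward item while still occasionally sampling the others, which is the finite-time exploration behavior the paper's online guarantees concern.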
