BiUCB: A Contextual Bandit Algorithm for Cold-Start and Diversified Recommendation

In web-based scenarios, new users and new items continually join the recommender system without any prior interaction events, and users' preferences are dynamic and diverse. Cold start and diversity are therefore two serious challenges for recommender systems. Recent work shows that both problems can be addressed effectively by contextual multi-armed bandit (CMAB) algorithms, which model the cold-start and diversified recommendation process as a bandit game. However, existing methods treat only items or only users as arms, which lowers accuracy on the other side. In this paper, we propose a novel bandit algorithm called binary upper confidence bound (BiUCB), which employs a binary UCB that treats users and items as arms of each other. BiUCB can handle the item-user cold-start problem, in which no information is available about either users or items. Furthermore, BiUCB and k-ε-greedy can be combined into a switching algorithm that significantly improves the temporal diversity of the overall recommendation process. Extensive experiments on real-world datasets demonstrate the precision of BiUCB and the diversity of the switching algorithm.
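To make the bi-directional idea concrete, below is a minimal sketch of a UCB-style selector that keeps separate arm statistics for users and items, so that either side can serve as the arm set for the other. This is an illustrative assumption, not the paper's method: the class name BiUCBSketch, the alpha parameter, and the plain UCB1 index are all invented for the sketch, and the contextual (feature-based) confidence bounds of the actual BiUCB are omitted.

```python
import math

def ucb_index(mean_reward, pulls, total_rounds, alpha=1.0):
    # Standard UCB1-style index: empirical mean plus an exploration bonus
    # that shrinks as an arm is pulled more often. (Assumption: the paper's
    # binary UCB uses contextual confidence bounds instead.)
    if pulls == 0:
        return float("inf")  # force every arm to be tried at least once
    return mean_reward + alpha * math.sqrt(2.0 * math.log(total_rounds) / pulls)

class BiUCBSketch:
    """Hypothetical sketch: users and items are arms of each other,
    each side keeping its own reward statistics."""

    def __init__(self, users, items):
        # arm -> [empirical mean reward, pull count], kept per side
        self.stats = {"user": {u: [0.0, 0] for u in users},
                      "item": {i: [0.0, 0] for i in items}}
        self.t = 0  # total number of selection rounds so far

    def select(self, side):
        # Pick the arm on `side` ("user" or "item") with the largest UCB index.
        self.t += 1
        arms = self.stats[side]
        return max(arms, key=lambda a: ucb_index(arms[a][0], arms[a][1], self.t))

    def update(self, side, arm, reward):
        # Incremental update of the pulled arm's empirical mean.
        mean, n = self.stats[side][arm]
        self.stats[side][arm] = [mean + (reward - mean) / (n + 1), n + 1]

# Usage: pick an item arm for an incoming user event, observe a click
# (reward 1) or skip (reward 0), then update that item's statistics.
bandit = BiUCBSketch(users=["u1", "u2"], items=["i1", "i2", "i3"])
item = bandit.select("item")
bandit.update("item", item, reward=1.0)
```

The switching algorithm mentioned in the abstract could then alternate between this UCB selector and a k-ε-greedy step that explores at random with probability ε; the exact switching criterion is specific to the paper and is not reproduced here.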
