CLEF NewsREEL 2017: Contextual Bandit News Recommendation

In the CLEF NewsREEL 2017 challenge, we build a delegation model based on a contextual bandit algorithm. Our goal is to investigate whether a bandit approach, combined with context extracted from the user side, from the item side, and from user-item interaction, can help choose the appropriate recommender from a pool of recommender algorithms for each incoming recommendation request. We took part in both tasks: NewsREEL Live and NewsREEL Replay. In the experiments, we test several bandit approaches with two types of context features. The results from NewsREEL Replay suggest that a delegation model based on a contextual bandit algorithm can improve the click-through rate (CTR). In NewsREEL Live, a similar delegation model is implemented. However, the delegation model in NewsREEL Live is trained on the data stream from NewsREEL Replay, because the volume of data received in the online scenario is too low to support training the delegation model. In future work, we will add more recommender algorithms to the pool and explore other context features.
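
To make the delegation idea concrete, the following is a minimal sketch of how a bandit could pick one recommender per request given a context. It is not the model used in the challenge: the abstract does not name the bandit variants or the context features, so this sketch assumes a simple epsilon-greedy policy over a single discrete context key, and the class name, recommender names, and context values are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyDelegator:
    """Illustrative delegator: chooses one recommender (arm) per request.

    Assumption: the context is reduced to one hashable key (for example a
    news category). The real context features are not given in the abstract.
    """

    def __init__(self, recommenders, epsilon=0.1, seed=0):
        self.recommenders = list(recommenders)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (context, recommender) -> [impressions, clicks]
        self.stats = defaultdict(lambda: [0, 0])

    def ctr(self, context, name):
        """Observed click-through rate of a recommender in a context."""
        shown, clicks = self.stats[(context, name)]
        return clicks / shown if shown else 0.0

    def choose(self, context):
        """Explore with probability epsilon, otherwise exploit the best CTR."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.recommenders)
        return max(self.recommenders, key=lambda name: self.ctr(context, name))

    def update(self, context, name, clicked):
        """Record one impression and whether it led to a click."""
        entry = self.stats[(context, name)]
        entry[0] += 1
        entry[1] += int(clicked)


# Hypothetical recommender pool and contexts, for illustration only.
delegator = EpsilonGreedyDelegator(["most_popular", "most_recent"], epsilon=0.0)
delegator.update("sport", "most_popular", clicked=False)
delegator.update("sport", "most_recent", clicked=True)
delegator.update("politics", "most_popular", clicked=True)
delegator.update("politics", "most_recent", clicked=False)

sport_choice = delegator.choose("sport")
politics_choice = delegator.choose("politics")
```

With exploration switched off (`epsilon=0.0`), the delegator routes each context to the recommender with the highest observed CTR in that context. Training such a model offline on a replayed stream and then serving it live would correspond to calling `update` on logged impressions before `choose` is used on live requests.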
