Balancing between Estimated Reward and Uncertainty during News Article Recommendation for ICML 2012 Exploration and Exploitation Challenge

Automatically recommending relevant content to users of a web service is an important task that is directly linked to the income of many internet companies. The ICML 2012 Exploration & Exploitation Workshop holds an open challenge that aims at building a state-of-the-art news article recommendation system on the Yahoo! platform. We propose an efficient scoring model that recommends the news article with the highest score during each user visit. The scoring model exploits by recommending the article with the highest estimated reward, and explores articles with high reward potential by means of uncertainty measures. Three important aspects, namely the global quality of articles, the personal preferences of users, and time effects, are all considered in the scoring model. Furthermore, during the challenge, we adopt a systematic parameter tuning process to optimize the performance of the model. The tuned scoring model won first place in phase one of the challenge.
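
As a rough illustration of the trade-off between estimated reward and uncertainty described above, the following minimal sketch scores each article by its observed click-through rate plus an uncertainty bonus and recommends the highest-scoring one. It uses a generic UCB-style bonus as a stand-in and is not the scoring model proposed in this paper; the names `ucb_score`, `recommend`, and `alpha`, and the click/view counts, are assumptions made for the example, and personal preference and time effects are left out.

```python
import math


def ucb_score(clicks, views, total_views, alpha=1.0):
    """Estimated reward (click-through rate) plus an uncertainty bonus.

    Hypothetical helper: a generic UCB-style score, not the paper's model.
    """
    if views == 0:
        # An article that was never shown is treated as maximally uncertain.
        return float("inf")
    estimated_reward = clicks / views
    # The bonus shrinks as the article accumulates views.
    uncertainty = alpha * math.sqrt(math.log(total_views) / views)
    return estimated_reward + uncertainty


def recommend(stats, alpha=1.0):
    """stats maps article id -> (clicks, views); return the top-scoring id."""
    total_views = max(1, sum(views for _, views in stats.values()))
    return max(stats, key=lambda a: ucb_score(*stats[a], total_views, alpha))


# Made-up counts: "a" is well explored, "b" has been shown only a few times.
stats = {"a": (60, 1000), "b": (1, 20)}
print(recommend(stats, alpha=0.0))  # pure exploitation
print(recommend(stats, alpha=1.0))  # exploitation plus exploration
```

With `alpha=0.0` the choice is driven only by the estimated reward, so the well-explored article is picked; with a positive `alpha` the rarely shown article gets a larger bonus and can be selected instead.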