Balancing between Estimated Reward and Uncertainty during News Article Recommendation for ICML 2012 Exploration and Exploitation Challenge
[1] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[2] Shou-De Lin, et al. Novel Models and Ensemble Techniques to Discriminate Favorite Items from Unrated Ones for Personalized Music Recommendation, 2012, KDD Cup.
[3] Wei Chu, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2010, WSDM '11.
[4] Shou-De Lin, et al. A Linear Ensemble of Individual and Blended Models for Music Rating Prediction, 2012, KDD Cup.
[5] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[6] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[7] Martin Pál, et al. Contextual Multi-Armed Bandits, 2010, AISTATS.
[8] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[9] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.