Exploitation and exploration in a performance based contextual advertising system
Wei Li | Ying Cui | Rong Jin | Ruofei Zhang | Jianchang Mao | Xuerui Wang
[1] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[2] Bee-Chung Chen, et al. Explore/Exploit Schemes for Web Content Optimization, 2009, 2009 Ninth IEEE International Conference on Data Mining.
[3] Ben Gerson. The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, 2005.
[4] Chris Watkins, et al. Learning from delayed rewards, 1989.
[5] Manfred K. Warmuth, et al. Exponentiated Gradient Versus Gradient Descent for Linear Predictors, 1997, Inf. Comput.
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Ambuj Tewari, et al. Efficient bandit algorithms for online multiclass prediction, 2008, ICML '08.
[8] P. Chatterjee, et al. Modeling the Clickstream: Implications for Web-Based Advertising Efforts, 2003.
[9] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[10] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[11] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[12] Deepayan Chakrabarti, et al. Contextual advertising by combining relevance with click feedback, 2008, WWW.
[13] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.