Contextual Dueling Bandits
Katja Hofmann | Miroslav Dudík | Aleksandrs Slivkins | Robert E. Schapire | Masrour Zoghi