Copeland Dueling Bandits
Masrour Zoghi | Zohar S. Karnin | Shimon Whiteson | Maarten de Rijke
[1] M. de Rijke, et al. MergeRUCB: A Method for Large-Scale Online Ranker Evaluation, 2015, WSDM.
[2] Katja Hofmann, et al. A probabilistic method for inferring preferences from clicks, 2011, CIKM '11.
[3] R. Munos, et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation, 2012, arXiv:1210.1136.
[4] Ron Kohavi, et al. Online controlled experiments at large scale, 2013, KDD.
[5] Katja Hofmann, et al. Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods, 2013, TOIS.
[6] Moshe Tennenholtz, et al. On the Axiomatic Foundations of Ranking Systems, 2005, IJCAI.
[7] Raphaël Féraud, et al. Generic Exploration and K-armed Voting Bandits, 2013, ICML.
[8] Markus Schulze, et al. A new monotonic, clone-independent, reversal symmetric, and Condorcet-consistent single-winner election method, 2011, Soc. Choice Welf.
[9] Lihong Li, et al. Toward Predicting the Outcome of an A/B Experiment for Search Relevance, 2015, WSDM.
[10] Thorsten Joachims, et al. The K-armed Dueling Bandits Problem, 2012, COLT.
[11] Eli Upfal, et al. Multi-Armed Bandits in Metric Spaces, 2008.
[12] Thorsten Joachims, et al. Interactively optimizing information retrieval systems as a dueling bandits problem, 2009, ICML '09.
[13] Filip Radlinski, et al. How does clickthrough data reflect retrieval quality?, 2008, CIKM '08.
[14] Alexander J. Smola, et al. Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations, 2012, ICML.
[15] Christopher D. Manning, et al. Introduction to Information Retrieval, 2010, J. Assoc. Inf. Sci. Technol.
[16] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[17] M. de Rijke, et al. Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem, 2013, ICML.
[18] M. de Rijke, et al. Multileaved Comparisons for Fast Online Evaluation, 2014, CIKM.
[19] Eyke Hüllermeier, et al. Preference Learning, 2005, Künstliche Intell.
[20] Christian Schindelhauer, et al. Discrete Prediction Games with Arbitrary Feedback and Loss, 2001, COLT/EuroCOLT.
[21] M. de Rijke, et al. Relative confidence sampling for efficient on-line ranker evaluation, 2014, WSDM.
[22] Csaba Szepesvári, et al. X-Armed Bandits, 2011, J. Mach. Learn. Res.
[23] Rémi Munos, et al. Stochastic Simultaneous Optimistic Optimization, 2013, ICML.
[24] Csaba Szepesvári, et al. An adaptive algorithm for finite stochastic partial monitoring, 2012, ICML.
[25] Thorsten Joachims, et al. Reducing Dueling Bandits to Cardinal Bandits, 2014, ICML.
[26] Eyke Hüllermeier, et al. A Survey of Preference-Based Online Learning with Bandit Algorithms, 2014, ALT.
[27] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[28] Nick Craswell, et al. An experimental comparison of click position-bias models, 2008, WSDM '08.
[29] Eyke Hüllermeier, et al. Top-k Selection based on Adaptive Sampling of Noisy Preferences, 2013, ICML.
[30] Chao Liu, et al. Efficient multiple-click models in web search, 2009, WSDM '09.
[31] Katja Hofmann, et al. Contextual Dueling Bandits, 2015, COLT.
[32] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[33] R. Rivest, et al. An Optimal Single-Winner Preferential Voting System Based on Game Theory, 2010.
[34] Eyke Hüllermeier, et al. PAC Rank Elicitation through Adaptive Sampling of Stochastic Pairwise Preferences, 2014, AAAI.
[35] Thorsten Joachims, et al. Beat the Mean Bandit, 2011, ICML.
[36] Devavrat Shah, et al. Iterative ranking from pair-wise comparisons, 2012, NIPS.
[37] Rémi Munos, et al. Optimistic Optimization of Deterministic Functions, 2011, NIPS.
[38] Adam D. Bull, et al. Convergence Rates of Efficient Global Optimization Algorithms, 2011, J. Mach. Learn. Res.
[39] Katja Hofmann, et al. Balancing Exploration and Exploitation in Listwise and Pairwise Online Learning to Rank for Information Retrieval, 2013, Inf. Retr.
[40] Thorsten Joachims, et al. Optimizing search engines using clickthrough data, 2002, KDD.