Advancements in Dueling Bandits
Yanan Sui | Masrour Zoghi | Katja Hofmann | Yisong Yue
[1] Robert E. Schapire, et al. Instance-dependent Regret Bounds for Dueling Bandits, 2016, COLT.
[2] Hiroshi Nakagawa, et al. Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm, 2016, ICML.
[3] Johannes Fürnkranz, et al. A Survey of Preference-Based Reinforcement Learning Methods, 2017, J. Mach. Learn. Res.
[4] Thorsten Joachims, et al. Interactively optimizing information retrieval systems as a dueling bandits problem, 2009, ICML '09.
[5] H. Robbins. Some aspects of the sequential design of experiments, 1952, Bull. Amer. Math. Soc.
[6] Ingemar J. Cox, et al. Multi-Dueling Bandits and Their Application to Online Ranker Evaluation, 2016, CIKM.
[7] Thorsten Joachims, et al. The K-armed Dueling Bandits Problem, 2012, COLT.
[8] M. de Rijke, et al. MergeRUCB: A Method for Large-Scale Online Ranker Evaluation, 2015, WSDM.
[9] Huasen Wu, et al. Double Thompson Sampling for Dueling Bandits, 2016, NIPS.
[10] Arun Rajkumar, et al. Dueling Bandits: Beyond Condorcet Winners to General Tournament Solutions, 2016, NIPS.
[11] Robert M. Thrall, et al. Mathematics of Operations Research, 1978.
[12] Neil D. Lawrence, et al. Preferential Bayesian Optimization, 2017, ICML.
[13] Katja Hofmann, et al. A probabilistic method for inferring preferences from clicks, 2011, CIKM '11.
[14] Peter Secretan. Learning, 1965, Mental Health.
[15] Filip Radlinski, et al. How does clickthrough data reflect retrieval quality?, 2008, CIKM '08.
[16] Fabrice Clérot, et al. A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits, 2015, ICML.
[17] Eyke Hüllermeier, et al. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, 2012, Mach. Learn.
[18] Eyke Hüllermeier, et al. Online Rank Elicitation for Plackett-Luce: A Dueling Bandits Approach, 2015, NIPS.
[19] Filip Radlinski, et al. Predicting Search Satisfaction Metrics with Interleaved Comparisons, 2015, SIGIR.
[20] M. de Rijke, et al. Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem, 2013, ICML.
[21] Stefan Riezler, et al. Bandit structured prediction for learning from partial feedback in statistical machine translation, 2016, MT Summit.
[22] Joel W. Burdick, et al. Multi-dueling Bandits with Dependent Arms, 2017, UAI.
[23] Filip Radlinski, et al. Large-scale validation and analysis of interleaved search evaluation, 2012, TOIS.
[24] Thorsten Joachims, et al. Beat the Mean Bandit, 2011, ICML.
[25] M. de Rijke, et al. Click-based Hot Fixes for Underperforming Torso Queries, 2016, SIGIR.
[26] Csaba Szepesvári, et al. Online Learning to Rank in Stochastic Click Models, 2017, ICML.
[27] M. de Rijke, et al. Relative confidence sampling for efficient on-line ranker evaluation, 2014, WSDM.
[28] Thorsten Joachims, et al. Reducing Dueling Bandits to Cardinal Bandits, 2014, ICML.
[29] M. de Rijke, et al. Copeland Dueling Bandits, 2015, NIPS.
[30] Robert D. Nowak, et al. Sparse Dueling Bandits, 2015, AISTATS.
[31] Nicolò Cesa-Bianchi, et al. Regret Minimization Under Partial Monitoring, 2006, IEEE Information Theory Workshop (ITW '06).
[32] Hiroshi Nakagawa, et al. Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem, 2015, COLT.
[33] Raphaël Féraud, et al. Generic Exploration and K-armed Voting Bandits, 2013, ICML.
[34] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[35] Wataru Kumagai. Regret Analysis for Continuous Dueling Bandit, 2017, NIPS.
[36] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[37] Joel W. Burdick, et al. Stagewise Safe Bayesian Optimization with Gaussian Processes, 2018, ICML.
[38] Katja Hofmann, et al. Contextual Dueling Bandits, 2015, COLT.
[39] Joel W. Burdick, et al. Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces, 2017, IJCAI.
[40] Bangrui Chen, et al. Dueling Bandits with Weak Regret, 2017, ICML.