The K-armed Dueling Bandits Problem
Thorsten Joachims | Yisong Yue | Robert D. Kleinberg | Josef Broder
[1] Robert D. Kleinberg,et al. Regret bounds for sleeping experts and bandits , 2010, Machine Learning.
[2] Pinar Donmez,et al. On the local optimality of LambdaRank , 2009, SIGIR.
[3] Jean-Yves Audibert,et al. Minimax Policies for Adversarial and Stochastic Bandits , 2009, COLT.
[4] Thorsten Joachims,et al. Interactively optimizing information retrieval systems as a dueling bandits problem , 2009, ICML '09.
[5] Filip Radlinski,et al. How does clickthrough data reflect retrieval quality? , 2008, CIKM '08.
[6] Avinatan Hassidim,et al. The Bayesian Learner is Optimal for Noisy Binary Search (and Pretty Good for Quantum as Well) , 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.
[7] Mehryar Mohri,et al. An Efficient Reduction of Ranking to Classification , 2007, COLT.
[8] Rocco A. Servedio,et al. Boosting the Area under the ROC Curve , 2007, NIPS.
[9] J. Langford,et al. The Epoch-Greedy algorithm for contextual multi-armed bandits , 2007, NIPS 2007.
[10] Richard M. Karp,et al. Noisy binary search and its applications , 2007, SODA '07.
[11] Deepayan Chakrabarti,et al. Bandits for Taxonomies: A Model-based Approach , 2007, SDM.
[12] Shie Mannor,et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems , 2006, J. Mach. Learn. Res..
[13] Nicolò Cesa-Bianchi,et al. Regret Minimization Under Partial Monitoring , 2006, 2006 IEEE Information Theory Workshop - ITW '06 Punta del Este.
[14] Anonymous Author. Robust Reductions from Ranking to Classification , 2006 .
[15] Thorsten Joachims,et al. A support vector method for multivariate performance measures , 2005, ICML.
[16] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[17] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[18] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[19] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[20] Klaus Obermayer,et al. Support vector learning for ordinal regression , 1999 .
[21] Yoram Singer,et al. An Efficient Boosting Algorithm for Combining Preferences , 1998, ICML.
[22] Yoram Singer,et al. Learning to Order Things , 1997, NIPS.
[23] Rajeev Motwani,et al. Randomized Algorithms , 1995, SIGACT News.
[24] Eli Upfal,et al. Computing with Noisy Information , 1994, SIAM J. Comput..
[25] Claire Mathieu,et al. Selection in the presence of noise: the design of playoff systems , 1994, SODA '94.
[26] N. Fisher,et al. Probability Inequalities for Sums of Bounded Random Variables , 1994 .
[27] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[28] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[29] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .