[1] Filip Radlinski,et al. Active exploration for learning rankings from clickthrough data , 2007, KDD '07.
[2] Eyke Hüllermeier,et al. Preference Learning , 2005, Künstliche Intell..
[3] Eyke Hüllermeier,et al. Preference Learning , 2010 .
[4] Richard M. Karp,et al. Noisy binary search and its applications , 2007, SODA '07.
[5] Eyke Hüllermeier,et al. Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows , 2014, ICML.
[6] Irène Charon,et al. An updated survey on the linear ordering problem for weighted or unweighted tournaments , 2010, Ann. Oper. Res..
[7] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[8] Eyke Hüllermeier,et al. Top-k Selection based on Adaptive Sampling of Noisy Preferences , 2013, ICML.
[9] M. de Rijke,et al. Relative confidence sampling for efficient on-line ranker evaluation , 2014, WSDM.
[10] Yoram Singer,et al. An Efficient Boosting Algorithm for Combining Preferences , 2013 .
[11] Thorsten Joachims,et al. Reducing Dueling Bandits to Cardinal Bandits , 2014, ICML.
[12] Filip Radlinski,et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search , 2007, TOIS.
[13] Tie-Yan Liu,et al. Learning to rank for information retrieval , 2009, SIGIR.
[14] Christian Schindelhauer,et al. Discrete Prediction Games with Arbitrary Feedback and Loss , 2001, COLT/EuroCOLT.
[15] Tao Qin,et al. LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval , 2007 .
[16] Thorsten Joachims,et al. The K-armed Dueling Bandits Problem , 2012, COLT.
[17] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[18] Csaba Szepesvári,et al. Partial Monitoring - Classification, Regret Bounds, and Algorithms , 2014, Math. Oper. Res..
[19] Thorsten Joachims,et al. Interactively optimizing information retrieval systems as a dueling bandits problem , 2009, ICML '09.
[20] M. de Rijke,et al. MergeRUCB: A Method for Large-Scale Online Ranker Evaluation , 2015, WSDM.
[21] Filip Radlinski,et al. Large-scale validation and analysis of interleaved search evaluation , 2012, TOIS.
[22] Eyke Hüllermeier,et al. A Survey of Preference-Based Online Learning with Bandit Algorithms , 2014, ALT.
[23] Thorsten Joachims,et al. Beat the Mean Bandit , 2011, ICML.
[24] Peter Auer,et al. Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments , 2013, EWRL.
[25] Y. Freund,et al. Adaptive game playing using multiplicative weights , 1999 .
[26] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .
[27] M. de Rijke,et al. Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem , 2013, ICML.
[28] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[29] Raphaël Féraud,et al. Generic Exploration and K-armed Voting Bandits , 2013, ICML.
[30] Gábor Bartók,et al. A near-optimal algorithm for finite partial-monitoring games against adversarial opponents , 2013, COLT.
[31] Nicolò Cesa-Bianchi,et al. Combinatorial Bandits , 2012, COLT.