Dueling Bandits: From Two-dueling to Multi-dueling
Yihan Du | Siwei Wang | Longbo Huang