Combinatorial Bandits with Relative Feedback