Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Victor Gabillon | Mohammad Ghavamzadeh | Alessandro Lazaric