Good arm identification via bandit feedback
Hideaki Kano | Junya Honda | Kentaro Sakamaki | Kentaro Matsuura | Atsuyoshi Nakamura | Masashi Sugiyama
[1] E. Keystone, et al. Determining the Minimally Important Difference in the Clinical Disease Activity Index for Improvement and Worsening in Early Rheumatoid Arthritis Patients, 2015, Arthritis Care & Research.
[2] Shie Mannor, et al. Action Elimination and Stopping Conditions for Reinforcement Learning, 2003, ICML.
[3] Robert D. Nowak, et al. Top Arm Identification in Multi-Armed Bandits with Batch Arm Pulls, 2016, AISTATS.
[4] Liang Tang, et al. Personalized Recommendation via Parameter-Free Contextual Bandits, 2015, SIGIR.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] Feng Liu, et al. Design considerations and analysis planning of a phase 2a proof of concept study in rheumatoid arthritis in the presence of possible non-monotonicity, 2017, BMC Medical Research Methodology.
[7] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[8] Xi Chen, et al. Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing, 2014, ICML.
[9] Alexandra Carpentier, et al. An Optimal Algorithm for the Thresholding Bandit Problem, 2016, ICML.
[10] Patrick Durez, et al. Efficacy and safety of secukinumab in patients with rheumatoid arthritis: a phase II, dose-finding, double-blind, randomised, placebo-controlled study, 2012, Annals of the Rheumatic Diseases.
[11] R. Munos, et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation, 2012, arXiv:1210.1136.
[12] Jürgen Branke, et al. Integrating Techniques from Statistical Ranking into Evolutionary Algorithms, 2006, EvoWorkshops.
[13] T. L. Lai, H. Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Advances in Applied Mathematics.
[14] Ambuj Tewari, et al. PAC Subset Selection in Stochastic Multi-armed Bandits, 2012, ICML.
[15] Andrew P. Grieve, et al. ASTIN: a Bayesian adaptive dose–response trial in acute stroke, 2005, Clinical Trials.
[16] Balaraman Ravindran, et al. Thresholding Bandits with Augmented UCB, 2017, IJCAI.
[17] A. Law, et al. A procedure for selecting a subset of size m containing the l best of k independent normal populations, with applications to simulation, 1985.
[18] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, Journal of Machine Learning Research.
[19] Matthew Malloy, et al. lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits, 2013, COLT.
[20] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[21] Edward S. Kim, et al. The BATTLE trial: personalizing therapy for lung cancer, 2011, Cancer Discovery.
[22] Stefano Zamuner, et al. Safety, tolerability, pharmacokinetics and pharmacodynamics of an anti-oncostatin M monoclonal antibody in rheumatoid arthritis: results from phase II randomized, placebo-controlled trials, 2013, Arthritis Research & Therapy.