Junpei Komiyama | Kaito Ariu | Kenichiro McAlinn | Masahiro Kato
[1] Michal Valko, et al. Fixed-Confidence Guarantees for Bayesian Best-Arm Identification, 2019, AISTATS.
[2] R. Khan, et al. Sequential Tests of Statistical Hypotheses, 1972.
[3] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.
[4] Aurélien Garivier, et al. Optimal Best Arm Identification with Fixed Confidence, 2016, COLT.
[5] Masashi Sugiyama, et al. Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback, 2019, Neural Computation.
[6] Wei Chen, et al. Combinatorial Pure Exploration of Multi-Armed Bandits, 2014, NIPS.
[7] Chun-Hung Chen, et al. Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization, 2000, Discrete Event Dynamic Systems.
[8] Masashi Sugiyama, et al. Fully Adaptive Algorithm for Pure Exploration in Linear Bandits, 2017, arXiv:1710.05552.
[9] Andrew W. Moore, et al. The Racing Algorithm: Model Selection for Lazy Learners, 1997, Artificial Intelligence Review.
[10] Robert D. Nowak, et al. Top Arm Identification in Multi-Armed Bandits with Batch Arm Pulls, 2016, AISTATS.
[11] H. Robbins. Some Aspects of the Sequential Design of Experiments, 1952.
[12] Oren Somekh, et al. Almost Optimal Exploration in Multi-Armed Bandits, 2013, ICML.
[13] I. Johnstone, et al. Asymptotically Optimal Procedures for Sequential Adaptive Selection of the Best of Several Normal Means, 1982.
[14] Rémi Munos, et al. Pure Exploration in Finitely-Armed and Continuous-Armed Bandits, 2011, Theoretical Computer Science.
[15] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, Journal of Machine Learning Research.
[16] Daniel Russo, et al. Simple Bayesian Algorithms for Best Arm Identification, 2016, COLT.
[17] Alexandre Proutiere, et al. Optimal Best-Arm Identification in Linear Bandits, 2020, NeurIPS.
[18] Dominik D. Freydenberger, et al. Can We Learn to Gamble Efficiently?, 2010, COLT.
[19] Peter W. Glynn, et al. A Large Deviations Perspective on Ordinal Optimization, 2004, Proceedings of the 2004 Winter Simulation Conference.
[20] Ilya O. Ryzhov, et al. On the Convergence Rates of Expected Improvement Methods, 2016, Operations Research.
[21] Csaba Szepesvári, et al. Structured Best Arm Identification with Fixed Confidence, 2017, ALT.
[22] Alessandro Lazaric, et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence, 2012, NIPS.
[23] Steven L. Scott, et al. A Modern Bayesian Look at the Multi-Armed Bandit, 2010.
[24] Christian Igel, et al. Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search, 2009, ICML.
[25] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[26] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, Journal of Machine Learning Research.
[27] Rémi Munos, et al. Stochastic Simultaneous Optimistic Optimization, 2013, ICML.
[28] Alexandros G. Dimakis, et al. Identifying Best Interventions through Online Importance Sampling, 2017, ICML.
[29] Wouter M. Koolen, et al. Monte-Carlo Tree Search by Best Arm Identification, 2017, NIPS.
[30] Maximilian Kasy, et al. Adaptive Treatment Assignment in Experiments for Policy Choice, 2019, Econometrica.
[31] Shivaram Kalyanakrishnan, et al. Information Complexity in Bandit Subset Selection, 2013, COLT.
[32] Diego Klabjan, et al. Improving the Expected Improvement Algorithm, 2017, NIPS.
[33] Matthew Malloy, et al. lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits, 2013, COLT.
[34] Sébastien Bubeck, et al. Multiple Identifications in Multi-Armed Bandits, 2012, ICML.
[35] Robert E. Bechhofer, et al. Sequential Identification and Ranking Procedures: With Special Reference to Koopman-Darmois Populations, 1970.
[36] E. Paulson. A Sequential Procedure for Selecting the Population with the Largest Mean from $k$ Normal Populations, 1964.
[37] Yuan Zhou, et al. Best Arm Identification in Linear Bandits with Linear Dimension Dependency, 2018, ICML.
[38] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-Armed Bandit Problem, 2011, COLT.
[39] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, Journal of Machine Learning Research.
[40] Walter T. Federer, et al. Sequential Design of Experiments, 1967.
[41] Alessandro Lazaric, et al. Best-Arm Identification in Linear Bandits, 2014, NIPS.
[42] Ameet Talwalkar, et al. Non-stochastic Best Arm Identification and Hyperparameter Optimization, 2015, AISTATS.
[43] Wouter M. Koolen, et al. Non-Asymptotic Pure Exploration by Solving Games, 2019, NeurIPS.
[44] Lalit Jain, et al. Sequential Experimental Design for Transductive Linear Bandits, 2019, NeurIPS.
[45] Ambuj Tewari, et al. PAC Subset Selection in Stochastic Multi-Armed Bandits, 2012, ICML.
[46] Peter L. Bartlett, et al. Best of Both Worlds: Stochastic & Adversarial Best-Arm Identification, 2018, COLT.
[47] Alexandra Carpentier, et al. Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem, 2016, COLT.