Optimal $\delta$-Correct Best-Arm Selection for General Distributions
[1] L. J. Savage, et al. The nonexistence of certain statistical procedures in nonparametric problems, 1956.
[2] H. Chernoff. Sequential Design of Experiments, 1959.
[3] E. Lehmann. Testing Statistical Hypotheses, 1960.
[4] E. Paulson. A Sequential Procedure for Selecting the Population with the Largest Mean from $k$ Normal Populations, 1964.
[5] P. Billingsley, et al. Convergence of Probability Measures, 1970, The Mathematical Gazette.
[6] Robert E. Bechhofer, et al. Sequential identification and ranking procedures: with special reference to Koopman-Darmois populations, 1970.
[7] D. Luenberger. Optimization by Vector Space Methods, 1968.
[8] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[9] A. Fiacco, et al. Sensitivity and stability analysis for nonlinear programming, 1991.
[10] David Williams, et al. Probability with Martingales, 1991, Cambridge mathematical textbooks.
[11] Yu-Chi Ho, et al. Ordinal optimization of DEDS, 1992, Discret. Event Dyn. Syst.
[12] L. Dai. Convergence properties of ordinal comparison in the simulation of discrete event dynamic systems, 1995.
[13] A. Burnetas, et al. Optimal Adaptive Policies for Sequential Allocation Problems, 1996.
[14] R. Sundaram. A First Course in Optimization Theory, 1996.
[15] Chun-Hung Chen, et al. Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization, 2000, Discret. Event Dyn. Syst.
[16] Barry L. Nelson, et al. A fully sequential procedure for indifference-zone selection in simulation, 2001, TOMC.
[17] A. Müller, et al. Comparison Methods for Stochastic Models and Risks, 2002.
[18] C. Villani. Topics in Optimal Transportation, 2003.
[19] John N. Tsitsiklis, et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem, 2004, J. Mach. Learn. Res.
[20] Peter W. Glynn, et al. A large deviations perspective on ordinal optimization, 2004, Proceedings of the 2004 Winter Simulation Conference.
[21] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[22] Akimichi Takemura, et al. An Asymptotically Optimal Bandit Algorithm for Bounded Support Models, 2010, COLT.
[23] Dominik D. Freydenberger, et al. Can We Learn to Gamble Efficiently?, 2010, COLT.
[24] Rémi Munos, et al. Pure exploration in finitely-armed and continuous-armed bandits, 2011, Theor. Comput. Sci.
[25] Akimichi Takemura, et al. An asymptotically optimal policy for finite support models in the multiarmed bandit problem, 2009, Machine Learning.
[26] Ambuj Tewari, et al. PAC Subset Selection in Stochastic Multi-armed Bandits, 2012, ICML.
[27] R. Munos, et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation, 2012, arXiv:1210.1136.
[28] Alexandre Proutière, et al. Lipschitz Bandits: Regret Lower Bound and Optimal Algorithms, 2014, COLT.
[29] Matthew Malloy, et al. lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits, 2013, COLT.
[30] Akimichi Takemura, et al. Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards, 2015, J. Mach. Learn. Res.
[31] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, J. Mach. Learn. Res.
[32] Daniel Russo, et al. Simple Bayesian Algorithms for Best Arm Identification, 2016, COLT.
[33] Aurélien Garivier, et al. Optimal Best Arm Identification with Fixed Confidence, 2016, COLT.
[34] Subhashini Krishnasamy, et al. Sample complexity of partition identification using multi-armed bandits, 2018, COLT.