Top Arm Identification in Multi-Armed Bandits with Batch Arm Pulls
Robert D. Nowak | Xiaojin Zhu | Kwang-Sung Jun | Kevin G. Jamieson
[1] Christian Igel,et al. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, 2009, ICML '09.
[2] András György,et al. Online Learning under Delayed Feedback , 2013, ICML.
[3] A. Pellegrini,et al. A longitudinal study of bullying, dominance, and victimization during the transition from primary school through secondary school, 2002.
[4] Sébastien Bubeck,et al. Multiple Identifications in Multi-Armed Bandits , 2012, ICML.
[5] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[6] Oren Somekh,et al. Almost Optimal Exploration in Multi-Armed Bandits , 2013, ICML.
[7] Robert D. Nowak,et al. Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting , 2014, 2014 48th Annual Conference on Information Sciences and Systems (CISS).
[8] E. Paulson. A Sequential Procedure for Selecting the Population with the Largest Mean from $k$ Normal Populations, 1964.
[9] Robert E. Bechhofer,et al. A Sequential Multiple-Decision Procedure for Selecting the Best One of Several Normal Populations with a Common Unknown Variance, and Its Use with Various Experimental Designs, 1958.
[10] M. Newton,et al. Drosophila RNAi screen identifies host genes important for influenza virus replication , 2008, Nature.
[11] Peter Stone,et al. Efficient Selection of Multiple Bandit Arms: Theory and Practice , 2010, ICML.
[12] Yifan Wu,et al. On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments , 2015, ICML.
[13] Erik Ordentlich,et al. On delayed prediction of individual sequences , 2002, IEEE Trans. Inf. Theory.
[14] Ambuj Tewari,et al. PAC Subset Selection in Stochastic Multi-armed Bandits , 2012, ICML.
[15] Vianney Perchet,et al. Batched Bandit Problems , 2015, COLT.
[16] Gábor Lugosi,et al. Regret in Online Combinatorial Optimization , 2012, Math. Oper. Res.
[17] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[18] E. Ordentlich,et al. On delayed prediction of individual sequences , 2002, Proceedings IEEE International Symposium on Information Theory.
[19] Aurélien Garivier,et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models , 2014, J. Mach. Learn. Res.
[20] Mark Craven,et al. Limited Agreement of Independent RNAi Screens for Virus-Required Host Genes Owes More to False-Negative than False-Positive Factors , 2013, PLoS Comput. Biol.
[21] Andrew W. Moore,et al. Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation , 1993, NIPS.
[22] Alexandru Niculescu-Mizil. Multi-Armed Bandits with Betting, 2009.
[23] Jun-Ming Xu,et al. Learning from Bullying Traces in Social Media , 2012, NAACL.
[24] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[25] Matthew Malloy,et al. lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits , 2013, COLT.