Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits
Alessandro Lazaric | Alexandra Carpentier | Mohammad Ghavamzadeh | Peter Auer | Rémi Munos | András Antos
[1] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[2] W. J. Studden, et al. Theory of Optimal Experiments, 1972.
[3] T. J. Page. Multivariate Statistics: A Vector Space Approach, 1984.
[4] M. L. Eaton. Multivariate Statistics: A Vector Space Approach, 1985.
[5] David A. Cohn, et al. Active Learning with Statistical Models, 1996, NIPS.
[6] P. Brémaud. An Introduction to Probabilistic Modeling, 1994.
[7] P. Chaudhuri, et al. On efficient designing of nonlinear experiments, 1995.
[8] Robert D. Nowak, et al. Faster Rates in Regression via Active Learning, 2005, NIPS.
[9] P. Etoré, et al. Adaptive Optimal Allocation in Stratified Sampling Methods, 2007, 0711.4514.
[10] Massimiliano Pontil, et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.
[11] Csaba Szepesvári, et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, 2009, Theor. Comput. Sci.
[12] Dominik D. Freydenberger, et al. Can We Learn to Gamble Efficiently?, 2010, COLT.
[13] Varun Grover, et al. Active learning in heteroscedastic noise, 2010, Theor. Comput. Sci.
[14] Rémi Munos, et al. Pure exploration in finitely-armed and continuous-armed bandits, 2011, Theor. Comput. Sci.