Pure exploration in finitely-armed and continuous-armed bandits

[1]  Sonja Kuhnt,et al.  Design and analysis of computer experiments , 2010 .

[2]  R. Munos,et al.  Best Arm Identification in Multi-Armed Bandits , 2010, COLT.

[3]  Aleksandrs Slivkins,et al.  Sharp dichotomies for regret minimization in metric spaces , 2009, SODA '10.

[4]  Csaba Szepesvári,et al.  Exploration-exploitation tradeoff using variance estimates in multi-armed bandits , 2009, Theor. Comput. Sci.

[5]  Csaba Szepesvári,et al.  Online Optimization in X-Armed Bandits , 2008, NIPS.

[6]  A. Tsybakov,et al.  Gap-free Bounds for Stochastic Multi-Armed Bandit , 2008 .

[7]  Rémi Munos,et al.  Bandit Algorithms for Tree Search , 2007, UAI.

[8]  András György,et al.  Continuous Time Associative Bandit Problems , 2007, IJCAI.

[9]  Peter Auer,et al.  Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring , 2006, ALT.

[10]  Csaba Szepesvári,et al.  Bandit Based Monte-Carlo Planning , 2006, ECML.

[11]  K. Schlag.  ELEVEN - Tests needed for a Recommendation , 2006 .

[12]  Olivier Teytaud,et al.  Modification of UCT with Patterns in Monte-Carlo Go , 2006 .

[13]  Gilles Stoltz.  Incomplete information and internal regret in prediction of individual sequences , 2005 .

[14]  Gábor Lugosi,et al.  Minimizing regret with label efficient prediction , 2004, IEEE Transactions on Information Theory.

[15]  Robert D. Kleinberg.  Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.

[16]  Russell Greiner,et al.  The Budgeted Multi-armed Bandit Problem , 2004, COLT.

[17]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[18]  John N. Tsitsiklis,et al.  The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res.

[19]  Shie Mannor,et al.  PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.

[20]  Peter Auer,et al.  The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput.

[21]  Luc Devroye,et al.  Combinatorial methods in density estimation , 2001, Springer series in statistics.

[22]  Y. Freund,et al.  The non-stochastic multi-armed bandit problem , 2001 .

[23]  Chun-Hung Chen,et al.  Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization , 2000, Discret. Event Dyn. Syst.

[24]  Colin McDiarmid,et al.  Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .

[25]  H. Robbins,et al.  Asymptotically efficient adaptive allocation rules , 1985 .

[26]  P. Billingsley,et al.  Convergence of Probability Measures , 1970, The Mathematical Gazette.

[27]  W. Hoeffding.  Probability inequalities for sums of bounded random variables , 1963 .

[28]  H. Robbins.  Some aspects of the sequential design of experiments , 1952 .