[1] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[2] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[3] Csaba Szepesvári, et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, 2009, Theor. Comput. Sci.
[4] Akimichi Takemura, et al. An Asymptotically Optimal Bandit Algorithm for Bounded Support Models, 2010, COLT.
[5] Ole-Christoffer Granmo, et al. Solving two-armed Bernoulli bandit problems using a Bayesian learning automaton, 2010, Int. J. Intell. Comput. Cybern.
[6] Jean-Yves Audibert, et al. Regret Bounds and Minimax Policies under Partial Monitoring, 2010, J. Mach. Learn. Res.
[7] Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.
[8] Rémi Munos, et al. A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences, 2011, COLT.
[9] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[10] Aurélien Garivier, et al. On Bayesian Upper Confidence Bounds for Bandit Problems, 2012, AISTATS.
[11] David S. Leslie, et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2012, J. Mach. Learn. Res.
[12] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.