The Pareto Regret Frontier for Bandits
[1] Yishay Mansour, et al. Regret to the best vs. regret to the average, 2007, Machine Learning.
[2] Marcus Hutter, et al. Adaptive Online Prediction by Following the Perturbed Leader, 2005, J. Mach. Learn. Res.
[3] Jean-Yves Audibert, et al. Minimax Policies for Adversarial and Stochastic Bandits, 2009, COLT.
[4] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.
[5] Sébastien Bubeck. Bandits Games and Clustering Foundations, 2010.
[6] Rina Panigrahy, et al. Prediction strategies without loss, 2010, NIPS.
[7] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, 1933.
[8] Shipra Agrawal, et al. Further Optimal Regret Bounds for Thompson Sampling, 2012, AISTATS.
[9] Lihong Li, et al. On the Prior Sensitivity of Thompson Sampling, 2015, ALT.
[10] R. Munos, et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation, 2012, arXiv:1210.1136.
[11] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[12] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[13] Wouter M. Koolen. The Pareto Regret Frontier, 2013, NIPS.
[14] Tor Lattimore, et al. Optimally Confident UCB: Improved Regret for Finite-Armed Bandits, 2015, arXiv.
[15] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[16] Alessandro Lazaric, et al. Exploiting easy data in online optimization, 2014, NIPS.
[17] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[18] Gábor Lugosi, et al. Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
[19] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.