Improved Analysis of the Tsallis-INF Algorithm in Stochastically Constrained Adversarial Bandits and Stochastic Bandits with Adversarial Corruptions
[1] Renato Paes Leme, et al. Stochastic bandits robust to adversarial corruptions, 2018, STOC.
[2] Jean-Yves Audibert, et al. Minimax Policies for Adversarial and Stochastic Bandits, 2009, COLT.
[3] Yevgeny Seldin, et al. An Algorithm for Stochastic and Adversarial Bandits with Switching Costs, 2021, ICML.
[4] C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics, 1988.
[5] Haipeng Luo, et al. Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition, 2020, NeurIPS.
[6] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[7] Haipeng Luo, et al. More Adaptive Algorithms for Adversarial Bandits, 2018, COLT.
[8] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.
[9] Aleksandrs Slivkins, et al. One Practical Algorithm for Both Stochastic and Adversarial Bandits, 2014, ICML.
[10] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[11] Ioannis Chatzigeorgiou, et al. Bounds on the Lambert Function and Their Application to the Outage Analysis of User Cooperation, 2013, IEEE Communications Letters.
[12] Julian Zimmert, et al. Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits, 2018, J. Mach. Learn. Res.
[13] Csaba Szepesvari, et al. Bandit Algorithms, 2020.
[14] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, 1933.
[15] Anupam Gupta, et al. Better Algorithms for Stochastic Bandits with Adversarial Corruptions, 2019, COLT.
[16] Tor Lattimore, et al. Refining the Confidence Level for Optimistic Bandit Strategies, 2018, J. Mach. Learn. Res.
[17] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[18] Yevgeny Seldin, et al. Tsallis-INF for Decoupled Exploration and Exploitation in Multi-armed Bandits, 2020, COLT.
[19] Gábor Lugosi, et al. An Improved Parametrization and Analysis of the EXP3++ Algorithm for Stochastic and Adversarial Bandits, 2017, COLT.
[20] Peter Auer, et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, 2010, Period. Math. Hung.
[21] Peter Auer, et al. An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits, 2016, COLT.
[22] Julian Zimmert, et al. Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously, 2019, ICML.
[23] Jean-Yves Audibert, et al. Regret Bounds and Minimax Policies under Partial Monitoring, 2010, J. Mach. Learn. Res.
[24] Aleksandrs Slivkins, et al. The Best of Both Worlds: Stochastic and Adversarial Bandits, 2012, COLT.
[25] Ambuj Tewari, et al. Fighting Bandits with a New Kind of Smoothness, 2015, NIPS.