Online Learning with Dependent Stochastic Feedback Graphs
Claudio Gentile | Corinna Cortes | Giulia DeSalvo | Mehryar Mohri | Ningshan Zhang
[1] Atilla Eryilmaz,et al. Stochastic bandits with side observations on networks , 2014, SIGMETRICS '14.
[2] Shie Mannor,et al. From Bandits to Experts: On the Value of Side-Observations , 2011, NIPS.
[3] Rémi Munos,et al. Efficient learning by implicit exploration in bandit problems with side observations , 2014, NIPS.
[4] Yifan Wu,et al. Online Learning with Gaussian Payoffs and Side Observations , 2015, NIPS.
[5] Gergely Neu,et al. Explore no more: Improved high-probability regret bounds for non-stochastic bandits , 2015, NIPS.
[6] Marc Lelarge,et al. Leveraging Side Observations in Stochastic Bandits , 2012, UAI.
[7] Fang Liu,et al. Information Directed Sampling for Stochastic Bandits with Graph Feedback , 2017, AAAI.
[8] Noga Alon,et al. Online Learning with Feedback Graphs: Beyond Bandits , 2015, COLT.
[9] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput..
[10] Christos Dimitrakakis,et al. Thompson Sampling for Stochastic Bandits with Graph Feedback , 2017, AAAI.
[11] Claudio Gentile,et al. Online Learning with Abstention , 2017, ICML.
[12] Alan M. Frieze,et al. On the independence number of random graphs , 1990, Discret. Math..
[13] Tamir Hazan,et al. Online Learning with Feedback Graphs Without the Graphs , 2016, ICML.
[14] Mehryar Mohri,et al. Learning with Rejection , 2016, ALT.
[15] Noga Alon,et al. From Bandits to Experts: A Tale of Domination and Independence , 2013, NIPS.
[16] S. van de Geer. On Hoeffding's Inequality for Dependent Random Variables , 2002 .
[17] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[18] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[19] Jinwoo Shin,et al. Multi-armed Bandit with Additional Observations , 2018, Proc. ACM Meas. Anal. Comput. Syst..
[20] Zheng Wen,et al. Stochastic Online Learning with Probabilistic Graph Feedback , 2019, AAAI.
[21] A. Burnetas,et al. Optimal Adaptive Policies for Sequential Allocation Problems , 1996 .
[22] Jean-Yves Audibert,et al. Lower bounds and selectivity of weak-consistent policies in stochastic multi-armed bandit problem , 2013, J. Mach. Learn. Res..
[23] Claudio Gentile,et al. Adaptive and Self-Confident On-Line Learning Algorithms , 2000, J. Comput. Syst. Sci..
[24] Noga Alon,et al. Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback , 2014, SIAM J. Comput..
[25] Claudio Gentile,et al. Online Learning with Sleeping Experts and Feedback Graphs , 2019, ICML.
[26] Michal Valko,et al. Online Learning with Noisy Side Observations , 2016, AISTATS.
[27] Gábor Lugosi,et al. Prediction, learning, and games , 2006 .