Nonstochastic Bandits with Composite Anonymous Feedback
[1] Chris Mesterharm, et al. On-line Learning with Delayed Label Feedback, 2005, ALT.
[2] Shie Mannor, et al. Online Learning for Adversaries with Memory: Price of Past Mistakes, 2015, NIPS.
[3] Erik Ordentlich, et al. On delayed prediction of individual sequences, 2002, IEEE Trans. Inf. Theory.
[4] Elad Hazan, et al. Interior-Point Methods for Full-Information and Bandit Online Learning, 2012, IEEE Transactions on Information Theory.
[5] Csaba Szepesvári, et al. Bandits with Delayed Anonymous Feedback, 2017, ArXiv.
[6] András György, et al. Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms, 2016, AAAI.
[7] Ambuj Tewari, et al. Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret, 2012, ICML.
[8] Thomas P. Hayes, et al. The Price of Bandit Information for Online Optimization, 2007, NIPS.
[9] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[10] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[11] Elad Hazan, et al. The Blinded Bandit: Learning with Adaptive Feedback, 2014, NIPS.
[12] Nate Soares, et al. Asymptotic Convergence in Online Learning with Unbounded Delays, 2016, ArXiv.
[13] Sham M. Kakade, et al. Towards Minimax Policies for Online Linear Optimization with Bandit Feedback, 2012, COLT.
[14] Ohad Shamir, et al. On the Complexity of Bandit Linear Optimization, 2014, COLT.
[15] Kent Quanrud, et al. Online Learning with Adversarial Delays, 2015, NIPS.
[16] John Langford, et al. Slow Learners are Fast, 2009, NIPS.
[17] Ambuj Tewari, et al. Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback, 2011, AISTATS.
[18] András György, et al. Online Learning under Delayed Feedback, 2013, ICML.
[19] Zoran Popovic, et al. The Queue Method: Handling Delay, Heuristics, Prior Data, and Evaluation in Bandits, 2015, AAAI.
[20] Claudio Gentile, et al. Delay and Cooperation in Nonstochastic Bandits, 2016, COLT.
[21] Ohad Shamir, et al. Online Learning with Local Permutations and Delayed Feedback, 2017, ICML.
[22] Yuval Peres, et al. Online Learning with Composite Loss Functions, 2014, COLT.
[23] Kent Quanrud, et al. Adversarial Delays in Online Strongly-Convex Optimization, 2016, ArXiv.
[24] Csaba Szepesvári, et al. Online Markov Decision Processes Under Bandit Feedback, 2010, IEEE Transactions on Automatic Control.
[25] Peter Auer, et al. Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, 2006, ALT.
[26] Elad Hazan, et al. Introduction to Online Convex Optimization, 2016, Found. Trends Optim.
[27] Vianney Perchet, et al. Stochastic Bandit Models for Delayed Conversions, 2017, UAI.