Perturbation Techniques in Online Learning and Optimization
[1] James Hannan, et al. Approximation to Bayes Risk in Repeated Play, 1958.
[2] D. Bertsekas. Stochastic optimization problems with nondifferentiable cost functionals, 1973.
[3] Manfred K. Warmuth, et al. The weighted majority algorithm, 1989, 30th Annual Symposium on Foundations of Computer Science.
[4] Paul Glasserman, et al. Gradient Estimation Via Perturbation Analysis, 1990.
[5] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[6] John Gittins, et al. Quantitative Methods in the Planning of Pharmaceutical Research, 1996.
[7] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[8] M. Simon. Probability distributions involving Gaussian random variables: a handbook for engineers and scientists, 2002.
[9] William H. Sandholm, et al. On the Global Convergence of Stochastic Fictitious Play, 2002.
[10] I. Molchanov. Theory of Random Sets, 2005.
[11] Tapio Elomaa, et al. On Following the Perturbed Leader in the Bandit Setting, 2005, ALT.
[12] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, Journal of Computer and System Sciences.
[13] Gábor Lugosi, et al. Prediction, Learning, and Games, 2006.
[14] Ambuj Tewari, et al. Optimal Strategies and Minimax Lower Bounds for Online Convex Games, 2008, COLT.
[15] Guy Van den Broeck, et al. Monte-Carlo Tree Search in Poker Using Expected Reward Distributions, 2009, ACML.
[16] Jacob D. Abernethy, et al. Beating the adaptive bandit with high probability, 2009, Information Theory and Applications Workshop.
[17] A. Nedić, et al. Convex nondifferentiable stochastic optimization: A local randomized smoothing technique, 2010, Proceedings of the American Control Conference.
[18] H. Brendan McMahan, et al. Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization, 2011, AISTATS.
[19] Ambuj Tewari, et al. On the Universality of Online Mirror Descent, 2011, NIPS.
[20] Una-May O'Reilly, et al. Hyperparameter Tuning in Bandit-Based Adaptive Operator Selection, 2012, EvoApplications.
[21] Martin J. Wainwright, et al. Randomized Smoothing for Stochastic Optimization, 2011, SIAM J. Optim.
[22] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, 2012, Found. Trends Mach. Learn.
[23] Marc Teboulle, et al. Smoothing and First Order Methods: A Unified Framework, 2012, SIAM J. Optim.
[24] Ohad Shamir, et al. Relax and Randomize: From Value to Algorithms, 2012, NIPS.
[25] Shai Shalev-Shwartz, et al. Online Learning and Online Convex Optimization, 2012, Found. Trends Mach. Learn.
[26] Gergely Neu, et al. An Efficient Algorithm for Learning with Semi-bandit Feedback, 2013, ALT.
[27] Jennifer Wortman Vaughan, et al. Efficient Market Making via Convex Optimization, and a Connection to Online Learning, 2013, TEAC.
[28] Luc Devroye, et al. Prediction by Random-Walk Perturbation, 2013, COLT.
[29] Ambuj Tewari, et al. Online Linear Optimization via Smoothing, 2014, COLT.
[30] Rémi Munos, et al. Efficient learning by implicit exploration in bandit problems with side observations, 2014, NIPS.
[31] Wojciech Kotlowski, et al. Follow the Leader with Dropout Perturbations, 2014, COLT.
[32] Ambuj Tewari, et al. Fighting Bandits with a New Kind of Smoothness, 2015, NIPS.
[33] Yurii Nesterov, et al. Random Gradient-Free Minimization of Convex Functions, 2015, Foundations of Computational Mathematics.