Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance). We quantify the friendliness of stochastic environments by means of the well-known Bernstein (a.k.a. generalized Tsybakov margin) condition. For two recent algorithms (Squint for the Hedge setting and MetaGrad for online convex optimization) we show that the particular form of their data-dependent individual-sequence regret guarantees implies that they adapt automatically to the Bernstein parameters of the stochastic environment. We prove that these algorithms attain fast rates in their respective settings both in expectation and with high probability.
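To make the setting concrete, here is a sketch of the two ingredients in standard notation (the symbols X_f, r_t^k, V_T^k, K_T and the exact constants and log factors below are illustrative, not quoted from the paper). The (B, beta)-Bernstein condition bounds the second moment of the excess loss X_f = \ell_f(Z) - \ell_{f^*}(Z) of any action f relative to the stochastically optimal action f^*:

\[
\mathbb{E}\bigl[X_f^2\bigr] \;\le\; B\,\bigl(\mathbb{E}[X_f]\bigr)^{\beta}
\quad\text{for all } f, \text{ with } B > 0 \text{ and } \beta \in [0,1].
\]

The data-dependent individual-sequence guarantees in question are second-order bounds of roughly the form

\[
R_T^{k} \;=\; O\!\Bigl(\sqrt{V_T^{k}\,K_T} + K_T\Bigr),
\qquad
V_T^{k} \;=\; \sum_{t=1}^{T} \bigl(r_t^{k}\bigr)^2,
\]

where r_t^k is the instantaneous regret against comparator k and K_T is a complexity term (on the order of ln K + ln ln T for Squint with K experts, and d ln T for MetaGrad in dimension d). Under the (B, beta)-Bernstein condition, a bound of this form self-improves to the fast rate

\[
\mathbb{E}[R_T] \;=\; O\!\Bigl(K_T^{\frac{1}{2-\beta}}\; T^{\frac{1-\beta}{2-\beta}}\Bigr),
\]

which interpolates between the worst-case rate \(\sqrt{T\,K_T}\) at beta = 0 and regret of order K_T (constant in T up to log factors) at beta = 1.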
