Smoothed Online Learning is as Easy as Statistical Learning

Much of modern learning theory has been split between two regimes: the classical offline setting, where data arrive independently, and the online setting, where data arrive adversarially. While the former model is often both computationally and statistically tractable, the latter requires no distributional assumptions. In an attempt to achieve the best of both worlds, previous work proposed the smooth online setting, where each sample is drawn from an adversarially chosen distribution that is smooth, i.e., has a bounded density with respect to a fixed dominating measure. Existing results for the smooth setting were known only for binary-valued function classes and were computationally expensive in general; in this paper, we fill these lacunae. In particular, we provide tight bounds on the minimax regret of learning a nonparametric function class, with nearly optimal dependence on both the horizon and smoothness parameters. Furthermore, we provide the first oracle-efficient, no-regret algorithms in this setting. Specifically, we propose an oracle-efficient improper algorithm whose regret achieves optimal dependence on the horizon, and a proper algorithm requiring only a single oracle call per round whose regret has the optimal horizon dependence in the classification setting and is sublinear in general. Both algorithms have exponentially worse dependence on the smoothness parameter of the adversary than the minimax rate. We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning. Finally, we apply our results to the contextual bandit setting to show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits when contexts arrive in a smooth manner.
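For concreteness, here is a minimal formalization of the two notions the abstract invokes, following the standard convention in this literature rather than the paper's own theorem statements; the symbols sigma, mu, X, F, l, and T are notation introduced here, not taken from the text above. A distribution p on the instance space X is sigma-smooth with respect to the fixed dominating measure mu if its density is bounded by 1/sigma, and regret is measured against the best fixed function in the class F over a horizon of T rounds:

\[
p \ \text{is } \sigma\text{-smooth w.r.t. } \mu
\;\Longleftrightarrow\;
\frac{dp}{d\mu}(x) \le \frac{1}{\sigma} \ \text{for } \mu\text{-a.e. } x,
\qquad
\mathrm{Reg}_T \;=\; \sum_{t=1}^{T} \ell\big(\hat{y}_t, y_t\big) \;-\; \inf_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell\big(f(x_t), y_t\big).
\]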
