Online Nonparametric Regression with General Loss Functions

This paper establishes minimax rates for online regression with arbitrary classes of functions and general loss functions. We show that below a certain threshold for the complexity of the function class, the minimax rates depend on both the curvature of the loss function and the sequential complexities of the class. Above this threshold, the curvature of the loss does not affect the rates. Furthermore, for the case of square loss, our results point to an interesting phenomenon: whenever the sequential and i.i.d. empirical entropies match, the rates for statistical and online learning are the same. In addition to the study of minimax regret, we derive a generic forecaster that enjoys the established optimal rates. We also provide a recipe for designing online prediction algorithms that can be computationally efficient for certain problems. We illustrate the techniques by deriving existing and new forecasters for the case of finite experts and for online linear regression.
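
For context, the minimax value studied in this line of work is the regret of the best prediction strategy against the worst-case data sequence. A standard formulation (our notation; the paper's exact protocol, including randomization of predictions, may differ) is

$$
\mathcal{V}_n(\mathcal{F}) \;=\; \sup_{x_1}\,\inf_{\hat y_1}\,\sup_{y_1}\,\cdots\,\sup_{x_n}\,\inf_{\hat y_n}\,\sup_{y_n}\;\left[\,\sum_{t=1}^{n}\ell(\hat y_t, y_t)\;-\;\inf_{f\in\mathcal{F}}\sum_{t=1}^{n}\ell(f(x_t), y_t)\right],
$$

where $\mathcal{F}$ is the function class, $\ell$ is the loss, and on each round the learner observes $x_t$, predicts $\hat y_t$, and then observes the outcome $y_t$.

For the finite-experts case mentioned above, the classic baseline is the exponentially weighted average forecaster. The sketch below is that standard algorithm under a generic loss, not the specific forecaster derived in the paper; the function name and the choice of learning rate `eta` are illustrative.

```python
import numpy as np

def exponential_weights(expert_preds, outcomes, eta, loss):
    """Exponentially weighted average forecaster (standard baseline sketch).

    expert_preds : (n_rounds, n_experts) array of expert predictions
    outcomes     : (n_rounds,) array of observed outcomes
    eta          : learning rate > 0
    loss         : vectorized loss(predictions, outcome) -> per-expert losses
    """
    n_rounds, n_experts = expert_preds.shape
    log_w = np.zeros(n_experts)          # log-weights; uniform at the start
    forecasts = np.empty(n_rounds)
    for t in range(n_rounds):
        w = np.exp(log_w - log_w.max())  # normalize in log-space for stability
        w /= w.sum()
        forecasts[t] = w @ expert_preds[t]                  # weighted-average prediction
        log_w -= eta * loss(expert_preds[t], outcomes[t])   # penalize experts by incurred loss
    return forecasts

# Example: square loss on a toy sequence.
rng = np.random.default_rng(0)
preds = rng.uniform(0, 1, size=(100, 5))
ys = rng.uniform(0, 1, size=100)
sq = lambda p, y: (p - y) ** 2
f = exponential_weights(preds, ys, eta=2.0, loss=sq)
```

For bounded convex losses and an appropriately tuned $\eta$, this forecaster's regret against the best of $N$ experts grows as $O(\sqrt{n \log N})$; the paper's contribution is to characterize such rates for general (nonparametric) classes beyond the finite case.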
