Robust linear least squares regression

We consider the problem of robustly predicting as well as the best linear combination of d given functions in least squares regression, along with variants of this problem that impose constraints on the parameters of the linear combination. For the ridge estimator, the ordinary least squares estimator, and their variants, we provide new risk bounds of order d/n, where n is the size of the training data; unlike some standard results, these bounds carry no logarithmic factor. We also provide a new estimator with better deviations in the presence of heavy-tailed noise. It is based on truncating differences of losses in a min-max framework and satisfies a d/n risk bound both in expectation and in deviations. The key surprising feature common to these results is that they achieve exponential deviations without any exponential moment condition on the output distribution. All risk bounds are obtained through a PAC-Bayesian analysis of truncated differences of losses. Experimental results strongly support our truncated min-max estimator.
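To make the objects named in the abstract concrete, the following is a minimal sketch and not the estimator analyzed in the paper. It implements ordinary least squares and ridge regression, then a simplified min-max selection over a finite candidate set using truncated differences of squared losses. The function names, the hard clipping to [-1, 1] used as the truncation, the `scale` parameter, and the restriction to a finite set of candidates are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    # Ordinary least squares; lstsq also handles rank-deficient designs.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    # Ridge estimator: solve (X^T X + lam * I) theta = X^T y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def minmax_truncated(X, y, candidates, scale):
    # Hypothetical simplification: choose among a finite set of candidates.
    # losses[i, k] is the squared loss of candidate i on sample k.
    losses = (y[None, :] - candidates @ X.T) ** 2
    # diff[i, j, k] = loss of candidate i minus loss of candidate j on sample k.
    diff = losses[:, None, :] - losses[None, :, :]
    # Assumed truncation: hard clipping of the scaled differences to [-1, 1],
    # so a few extreme samples cannot dominate the comparison.
    truncated = np.clip(scale * diff, -1.0, 1.0).sum(axis=2)
    # Min-max step: pick the candidate whose worst-case truncated
    # comparison against every other candidate is smallest.
    worst = truncated.max(axis=1)
    return candidates[np.argmin(worst)]

# Usage on synthetic data with heavy-tailed (Student t, 3 degrees of freedom) noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
theta_star = np.array([1.0, -2.0, 0.5])
y = X @ theta_star + rng.standard_t(df=3, size=200)
candidates = np.stack([ridge(X, y, lam) for lam in (0.0, 1.0, 10.0, 100.0)])
theta_hat = minmax_truncated(X, y, candidates, scale=0.1)
```

Clipping bounds the influence of any single observation on each pairwise comparison, which is the intuition behind truncating differences of losses; the truncation function and the parameter set used in the paper may differ from this sketch.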
