Outlier-robust estimation of a sparse linear model using $\ell_1$-penalized Huber's $M$-estimator

We study the problem of estimating a $p$-dimensional $s$-sparse vector in a linear model with Gaussian design and additive noise. In the case where the labels are contaminated by at most $o$ adversarial outliers, we prove that the $\ell_1$-penalized Huber's $M$-estimator based on $n$ samples attains the optimal rate of convergence $(s/n)^{1/2} + (o/n)$, up to a logarithmic factor. For more general design matrices, our results highlight the importance of two properties: the transfer principle and the incoherence property. These properties with suitable constants are shown to yield the optimal rates, up to log-factors, of robust estimation with adversarial contamination.
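
To make the estimator concrete, the following is a minimal sketch of an $\ell_1$-penalized Huber $M$-estimator computed by proximal gradient descent. It is not taken from the paper. The objective scaling $(1/n)\sum_i \rho_\delta(y_i - x_i^\top\beta) + \lambda\|\beta\|_1$, the solver, the tuning values `lam` and `delta`, the helper names, and the simulated data (a fixed shift applied to a few labels, which is a simple rather than an adversarial contamination) are all illustrative assumptions.

```python
import math
import random


def huber_score(r, delta):
    """Derivative of the Huber loss: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, r))


def soft_threshold(z, t):
    """Proximal operator of t * |z| (shrinks z toward zero by t)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def l1_huber(X, y, lam, delta, n_iter=500):
    """Proximal gradient descent on
    (1/n) * sum_i huber_delta(y_i - <x_i, beta>) + lam * ||beta||_1."""
    n, p = len(X), len(X[0])
    # The Huber loss has curvature at most 1, so the gradient of the smooth
    # part is Lipschitz with constant at most trace(X^T X) / n.
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        scores = [
            huber_score(yi - sum(a * b for a, b in zip(row, beta)), delta)
            for row, yi in zip(X, y)
        ]
        grad = [-sum(s * row[j] for s, row in zip(scores, X)) / n for j in range(p)]
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta


def simulate(n=100, p=10, n_outliers=5, seed=0):
    """Hypothetical test problem: Gaussian design, 2-sparse signal, shifted labels."""
    rng = random.Random(seed)
    beta_star = [3.0, -2.0] + [0.0] * (p - 2)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(a * b for a, b in zip(row, beta_star)) + rng.gauss(0.0, 0.5) for row in X]
    for i in range(n_outliers):
        y[i] += 50.0  # gross corruption of the first few labels
    return X, y, beta_star


X, y, beta_star = simulate()
beta_hat = l1_huber(X, y, lam=0.1, delta=1.0)
err = math.sqrt(sum((a - b) ** 2 for a, b in zip(beta_hat, beta_star)))
print(f"l2 estimation error: {err:.3f}")
```

Because the Huber score clips each residual at `delta`, a corrupted label contributes at most `delta` times its covariate vector to the gradient, which is the intuition behind the additive $o/n$ term in the rate. The pure-Python loops keep the sketch dependency-free; an array library would be the natural choice for larger problems.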
