Optimal Robust Linear Regression in Nearly Linear Time

We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$ (with $X \in \mathbb{R}^d$ and $\epsilon$ independent of $X$), an $\eta$ fraction of which have been adversarially corrupted. We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian. In both settings, our estimators (a) achieve optimal sample complexities and recovery guarantees up to logarithmic factors and (b) run in near-linear time ($\tilde{O}(nd/\eta^6)$). Prior to our work, polynomial-time algorithms achieving near-optimal sample complexities were known only in the setting where $X$ is Gaussian with identity covariance and $\epsilon$ is Gaussian, and no linear-time estimators were known for robust linear regression in any setting. Our estimators and their analysis leverage recent developments in the construction of faster algorithms for robust mean estimation to improve runtimes, and refined concentration-of-measure arguments alongside Gaussian rounding techniques to improve statistical sample complexities.
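To make the contamination model concrete, the following is a minimal sketch (not the paper's estimator) of drawing $n$ samples from $Y = \langle X, w^* \rangle + \epsilon$ and letting an adversary overwrite an $\eta$ fraction. The helper name and the particular corruption pattern are illustrative assumptions; the paper's adversary may corrupt samples arbitrarily.

```python
import numpy as np

def sample_corrupted_regression(n, d, eta, noise_std=1.0, seed=0):
    """Draw n samples from Y = <X, w*> + eps with Gaussian covariates,
    then overwrite an eta fraction with outliers (a simple, far from
    worst-case, stand-in for the adversarial corruption model)."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d) / np.sqrt(d)     # unknown regressor
    X = rng.normal(size=(n, d))                  # identity-covariance covariates
    y = X @ w_star + noise_std * rng.normal(size=n)
    k = int(eta * n)                             # number of corrupted samples
    idx = rng.choice(n, size=k, replace=False)
    X[idx] = 10.0 * rng.normal(size=(k, d))      # large-magnitude covariate outliers
    y[idx] = 100.0                               # grossly wrong responses
    return X, y, w_star, idx

X, y, w_star, bad = sample_corrupted_regression(n=2000, d=10, eta=0.1)
```

Ordinary least squares on such data is badly biased by the corrupted rows, which is the failure mode the robust estimators in this work are designed to avoid.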
