Mean estimation with sub-Gaussian rates in polynomial time

We study polynomial-time algorithms for estimating the mean of a heavy-tailed multivariate random vector. We assume only that the random vector $X$ has finite mean and covariance. In this setting, the radius of the confidence intervals achieved by the empirical mean is large compared to the case where $X$ is Gaussian or sub-Gaussian. We offer the first polynomial-time algorithm to estimate the mean with sub-Gaussian-size confidence intervals under such mild assumptions. Our algorithm is based on a new semidefinite programming relaxation of a high-dimensional median. Previous estimators that assume only the existence of finitely many moments of $X$ either sacrifice sub-Gaussian performance or are only known to be computable via brute-force search procedures requiring time exponential in the dimension.
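
To make "sub-Gaussian-size confidence intervals" concrete, the following sketch states the standard form of the two rates being compared. The notation is an assumption introduced here and does not appear in the abstract: $n$ is the number of i.i.d. samples, $\mu$ and $\Sigma$ are the mean and covariance of $X$, $\delta$ is the failure probability, $\bar{\mu}_n$ is the empirical mean, $\hat{\mu}$ is the estimator, and $C$ is an unspecified absolute constant. Under a finite-covariance assumption alone, Chebyshev's inequality gives, with probability at least $1 - \delta$,

$$
\|\bar{\mu}_n - \mu\| \le \sqrt{\frac{\mathrm{Tr}(\Sigma)}{\delta n}},
$$

whereas a sub-Gaussian-size guarantee typically means a bound of the form

$$
\|\hat{\mu} - \mu\| \le C \left( \sqrt{\frac{\mathrm{Tr}(\Sigma)}{n}} + \sqrt{\frac{\|\Sigma\| \log(1/\delta)}{n}} \right).
$$

The difference lies in the dependence on $\delta$: the first bound grows like $1/\sqrt{\delta}$, while the second grows only like $\sqrt{\log(1/\delta)}$ and multiplies the operator norm $\|\Sigma\|$ rather than the trace.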
