Learning Entangled Single-Sample Distributions via Iterative Trimming

In the setting of entangled single-sample distributions, the goal is to estimate a common parameter shared by a family of distributions, given a \emph{single} sample from each distribution. We study mean estimation and linear regression under general conditions, and analyze a simple, computationally efficient method that iteratively trims samples and re-estimates the parameter on the trimmed sample set. We show that, after a logarithmic number of iterations, the method outputs an estimate whose error depends only on the noise level of the $\lceil \alpha n \rceil$-th noisiest data point, where $\alpha$ is a constant and $n$ is the sample size. In particular, the method tolerates a constant fraction of high-noise points. These are the first such guarantees for this method under such general conditions, and they help explain the wide application and empirical success of iterative trimming in practice. Our theoretical results are complemented by experiments on synthetic data.
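To make the trim-and-re-estimate loop concrete, here is a minimal Python sketch for one-dimensional mean estimation. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name, the keep fraction (playing the role of the constant $\alpha$), and the logarithmic iteration count are all illustrative choices.

```python
import numpy as np

def iterative_trimmed_mean(x, keep_frac=0.8, n_iters=None):
    """Sketch of iterative trimming for 1-D mean estimation:
    repeatedly estimate the mean on the kept samples, then keep
    the keep_frac fraction of samples closest to that estimate."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = int(np.ceil(keep_frac * n))            # number of samples kept each round
    if n_iters is None:
        n_iters = int(np.ceil(np.log2(n))) + 1 # logarithmic number of iterations
    est = x.mean()                             # initialize on all samples
    for _ in range(n_iters):
        resid = np.abs(x - est)                # distance of each sample to the estimate
        kept = x[np.argsort(resid)[:k]]        # keep the k smallest residuals (trim the rest)
        est = kept.mean()                      # re-estimate on the trimmed set
    return est

# Toy usage: 90% low-noise samples and 10% high-noise samples, all with mean 5.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(5, 1, 900), rng.normal(5, 100, 100)])
print(iterative_trimmed_mean(x), x.mean())  # trimmed estimate typically tracks 5 much more closely
```

The same loop extends to linear regression in the natural way: replace the mean with a least-squares fit on the kept samples and rank samples by their squared residuals under the current fit.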
