A Convex Formulation for Mixed Regression: Near Optimal Rates in the Face of Noise

We consider the mixed regression problem with two components, under both adversarial and stochastic noise. We give a convex optimization formulation that provably recovers the true solution, and we provide upper bounds on the recovery error in both the arbitrary-noise and stochastic-noise settings. We also give matching minimax lower bounds (up to logarithmic factors), showing that under certain assumptions our algorithm is information-theoretically optimal. Our results give the first tractable algorithm that guarantees successful recovery with tight bounds on recovery error and sample complexity.
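As a point of reference for the problem setup described above, the sketch below generates data from a two-component mixed (linear) regression model: each response comes from one of two unknown regressors, with the component label hidden from the learner. This is only an illustration of the observation model, not the paper's convex formulation or algorithm; the dimensions, noise level, and the choice of Gaussian noise are hypothetical values picked for the example.

```python
import numpy as np

# Minimal sketch (not the paper's algorithm): data from a two-component
# mixed linear regression model. The component labels z are latent, so the
# estimation task is to recover (beta1, beta2) from (X, y) alone.

rng = np.random.default_rng(0)

n, d = 500, 10                       # samples and dimension (illustrative values)
beta1 = rng.standard_normal(d)       # unknown regressor of component 1
beta2 = rng.standard_normal(d)       # unknown regressor of component 2

X = rng.standard_normal((n, d))      # covariates
z = rng.integers(0, 2, size=n)       # hidden component labels (not observed)
sigma = 0.1                          # noise level (Gaussian noise as one illustrative choice)

# Observation model: y_i = <x_i, beta_{z_i}> + e_i
y = np.where(z == 0, X @ beta1, X @ beta2) + sigma * rng.standard_normal(n)

# Goal: estimate (beta1, beta2) from (X, y) without access to z.
```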
