Learning Mixtures of Linear Regressions with Nearly Optimal Complexity

Mixtures of linear regressions (MLR) is an important mixture model with many applications. In this model, each observation is generated by one of several unknown linear regression components, and the identity of the generating component is also unknown. Previous works either place strong assumptions on the data distribution or have high complexity. This paper proposes a fixed-parameter tractable algorithm for the problem under general conditions; it achieves global convergence, and its sample complexity scales nearly linearly in the dimension. In particular, unlike previous works that require the data to be drawn from the standard Gaussian, the algorithm allows the data to come from Gaussians with different covariances. When the condition number of the covariances and the number of components are fixed, the algorithm has nearly optimal sample complexity $N = \tilde{O}(d)$ as well as nearly optimal computational complexity $\tilde{O}(Nd)$, where $d$ is the dimension of the data space. To the best of our knowledge, this is the first such recovery guarantee for this general setting.
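To make the generative model concrete, below is a minimal Python sketch of how MLR data of the kind described above could be sampled. It is purely illustrative and is not the paper's algorithm; the specific values of the dimension `d`, the number of components `k`, the mixing (uniform over components), and the noise level are assumptions chosen for the example. Note that each component draws its covariates from a Gaussian with its own covariance, rather than the standard Gaussian of prior work.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, N = 10, 3, 1000  # dimension, number of components, sample size (illustrative)

# Unknown regression vectors, one per component.
betas = rng.normal(size=(k, d))

# Per-component Gaussian covariances (random diagonal here, so the
# condition number is easy to control); the standard-Gaussian setting
# of prior work corresponds to the identity covariance for every component.
covs = [np.diag(rng.uniform(0.5, 2.0, size=d)) for _ in range(k)]

X = np.empty((N, d))
y = np.empty(N)
for i in range(N):
    z = rng.integers(k)  # latent component identity (unobserved by the learner)
    X[i] = rng.multivariate_normal(np.zeros(d), covs[z])
    y[i] = X[i] @ betas[z] + 0.1 * rng.normal()  # linear response plus noise
```

Given only the pairs `(X, y)`, the learning task is to recover the `k` regression vectors `betas` without observing the labels `z`.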
