Global Convergence of the EM Algorithm for Mixtures of Two Component Linear Regression

The Expectation-Maximization (EM) algorithm is perhaps the most widely used algorithm for inference in latent variable problems. A theoretical understanding of its performance, however, largely remains lacking. Recent results established that EM enjoys global convergence for Gaussian Mixture Models. For Mixed Linear Regression, however, only local convergence results have been established, and only in the high SNR regime. We show here that EM converges for mixed linear regression with two components (it is known that it may fail to converge for three or more), and moreover that this convergence holds for random initialization. Our analysis reveals that EM exhibits very different behavior in Mixed Linear Regression from its behavior in Gaussian Mixture Models, and hence our proofs require the development of several new ideas.
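
For concreteness, below is a minimal sketch of the sample-based EM iteration for the symmetric two-component model y = r·⟨x, β*⟩ + ε with hidden sign r ∈ {±1}. The closed-form E- and M-steps are the standard ones for this model, but the specific setup (balanced mixture, known noise level σ, and all variable and function names) is an illustrative assumption rather than the paper's exact formulation.

```python
import numpy as np

# Sketch of EM for symmetric two-component mixed linear regression:
#   y_i = r_i * <x_i, beta*> + eps_i,  r_i uniform on {-1, +1},  eps_i ~ N(0, sigma^2).
# Names (em_step, n, d, sigma) are illustrative, not taken from the paper.

def em_step(X, y, beta, sigma):
    """One EM update: the E-step computes posterior sign weights; the M-step
    solves the resulting reweighted least-squares problem in closed form."""
    # E-step: 2*P(r_i = +1 | x_i, y_i; beta) - 1 = tanh(y_i * <x_i, beta> / sigma^2)
    w = np.tanh(y * (X @ beta) / sigma**2)
    # M-step: beta_new = (X^T X)^{-1} X^T (w * y)
    return np.linalg.solve(X.T @ X, X.T @ (w * y))

# Synthetic data and random initialization (the regime the paper analyzes).
rng = np.random.default_rng(0)
n, d, sigma = 2000, 10, 0.5
beta_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
signs = rng.choice([-1.0, 1.0], size=n)
y = signs * (X @ beta_star) + sigma * rng.normal(size=n)

beta = rng.normal(size=d)  # random initial point
for _ in range(50):
    beta = em_step(X, y, beta, sigma)

# The two components are interchangeable, so beta* is identifiable
# only up to a global sign flip.
err = min(np.linalg.norm(beta - beta_star), np.linalg.norm(beta + beta_star))
print(f"estimation error (up to sign): {err:.4f}")
```

The iterates of this update under random initialization, measured up to the global sign flip, are the object of the convergence analysis described above.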
