Improving the vector $$\varepsilon$$ acceleration for the EM algorithm using a re-starting procedure

The expectation–maximization (EM) algorithm is a popular algorithm for finding maximum likelihood estimates from incomplete data. However, the EM algorithm converges slowly when the proportion of missing data is large. Although many acceleration algorithms have been proposed, they require complex calculations. Kuroda and Sakakihara (Comput Stat Data Anal 51:1549–1561, 2006) developed the $$\varepsilon$$-accelerated EM algorithm, which uses only the sequence of estimates obtained by the EM algorithm to generate an accelerated sequence, without changing the original EM sequence. We find that the accelerated sequence often has larger likelihood values than the current estimate obtained by the EM algorithm. Thus, in this paper, we re-start the EM iterations from the accelerated sequence and thereby generate a new EM sequence with a faster rate of convergence. The algorithm has the further advantage of simple implementation, since it only uses the EM iterations and re-starts them from an estimate with a larger likelihood. The re-starting algorithm, called the $$\varepsilon$$R-accelerated EM algorithm, further improves on both the EM algorithm and the $$\varepsilon$$-accelerated EM algorithm in the sense that it reduces the number of iterations and the computation time.
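To make the re-starting idea concrete, the sketch below is a minimal Python illustration, not the authors' implementation. The functions em_step (one EM update of the parameter vector) and loglik (observed-data log-likelihood) are hypothetical, problem-specific inputs, and the acceleration step uses the Samelson (vector) inverse in the form commonly associated with the vector $$\varepsilon$$ algorithm of Kuroda and Sakakihara (2006); the exact re-starting and stopping rules of the $$\varepsilon$$R-accelerated EM algorithm may differ from this sketch.

```python
import numpy as np

def samelson_inverse(v):
    """Samelson inverse of a vector, v^{-1} = v / ||v||^2,
    as used by the vector epsilon algorithm."""
    return v / np.dot(v, v)

def epsilon_accelerate(theta_prev, theta_curr, theta_next):
    """Accelerated estimate formed from three consecutive EM iterates,
    following the vector epsilon acceleration (Kuroda and Sakakihara, 2006)."""
    return theta_curr + samelson_inverse(
        samelson_inverse(theta_next - theta_curr)
        + samelson_inverse(theta_prev - theta_curr)
    )

def em_with_restarts(theta0, em_step, loglik, tol=1e-8, max_iter=5000):
    """Sketch of the re-starting scheme: run ordinary EM, form the
    epsilon-accelerated estimate, and re-start the EM iterations from it
    whenever it attains a larger log-likelihood than the current EM estimate."""
    theta_prev = np.asarray(theta0, dtype=float)
    theta_curr = em_step(theta_prev)
    for _ in range(max_iter):
        theta_next = em_step(theta_curr)
        # stop (and avoid dividing by a near-zero difference) once EM has converged
        if np.linalg.norm(theta_next - theta_curr) < tol:
            return theta_next
        psi = epsilon_accelerate(theta_prev, theta_curr, theta_next)
        if loglik(psi) > loglik(theta_next):
            # re-start the EM sequence from the accelerated estimate
            theta_prev, theta_curr = psi, em_step(psi)
        else:
            theta_prev, theta_curr = theta_curr, theta_next
    return theta_curr
```

Because a re-start only ever replaces the current estimate by one with a larger likelihood, the new EM sequence retains the monotone-likelihood behaviour of ordinary EM while typically requiring fewer iterations, and the scheme can sit on top of any existing E- and M-step code.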

[1] P. Wynn, et al. Acceleration techniques for iterated vector and matrix problems, Mathematics of Computation, 16(1962), nr. 79, p. 301-322, 1962.

[2] T. Louis. Finding the Observed Information Matrix When Using the EM Algorithm, 1982.

[3] Jill P. Mesirov, et al. Automated High-Dimensional Flow Cytometric Data Analysis, 2010, RECOMB.

[4] R Core Team. R: A language and environment for statistical computing, 2014.

[5] William Finn, et al. Statistical file matching of flow cytometry data, 2010, J. Biomed. Informatics.

[6] R. Jennrich, et al. Conjugate Gradient Acceleration of the EM Algorithm, 1993.

[7] G. W. Snedecor. Statistical Methods, 1964.

[8] Andrew P. Robinson, et al. Introduction to Scientific Programming and Simulation Using R, 2014.

[9] D. Rubin, et al. Statistical Analysis with Missing Data, 1989.

[10] Tsung I. Lin, et al. Maximum likelihood estimation for multivariate skew normal mixture models, 2009, J. Multivar. Anal.

[11] Christophe Biernacki, et al. Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, 2003, Comput. Stat. Data Anal.

[12] Gyemin Lee, et al. EM algorithms for multivariate Gaussian mixture models with truncated and censored data, 2012, Comput. Stat. Data Anal.

[13] Joseph L. Schafer. Analysis of Incomplete Multivariate Data, 1997.

[14] Xiao-Li Meng, et al. On the global and componentwise rates of convergence of the EM algorithm, 1994.

[15] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[16] R. Jennrich, et al. Acceleration of the EM Algorithm by using Quasi-Newton Methods, 1997.

[17] Dimitris Karlis, et al. Choosing Initial Values for the EM Algorithm for Finite Mixtures, 2003, Comput. Stat. Data Anal.

[18] N. Laird, et al. Maximum likelihood computations with repeated measures: application of the EM algorithm, 1987.

[19] Masahiro Kuroda, et al. Accelerating the convergence of the EM algorithm using the vector $$\varepsilon$$ algorithm, 2006.

[20] Masahiro Kuroda, et al. Acceleration of the EM algorithm using the vector epsilon algorithm, 2008, Comput. Stat.

[21] Geoffrey J. McLachlan, et al. Finite Mixture Models, 2019, Annual Review of Statistics and Its Application.