Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm

Abstract. The expectation‐maximization (EM) algorithm is a popular approach for obtaining maximum likelihood estimates in incomplete data problems because of its simplicity and stability (e.g. monotonic increase of likelihood). However, in many applications the stability of EM is attained at the expense of slow, linear convergence. We have developed a new class of iterative schemes, called squared iterative methods (SQUAREM), to accelerate EM without compromising its simplicity and stability. SQUAREM generally achieves superlinear convergence in problems with a large fraction of missing information. Globally convergent schemes are easily obtained by viewing SQUAREM as a continuation of EM. SQUAREM is especially attractive in high‐dimensional problems and in problems where model‐specific analytic insights are not available. SQUAREM can be readily implemented as an ‘off‐the‐shelf’ accelerator of any EM‐type algorithm, because it requires only the EM parameter update. We present four examples to demonstrate the effectiveness of SQUAREM. A general‐purpose implementation (written in R) is available.
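To illustrate the claim that only the EM parameter update is required, the following is a minimal sketch in Python of one squared-extrapolation cycle wrapped around a user-supplied update function. The abstract gives no formulas, so several details here are assumptions: the steplength alpha = -||r|| / ||v|| (one of several possible choices), the safeguard alpha <= -1, and the names `squarem_step`, `em_update`, and `run`, which are hypothetical and are not the interface of the R implementation. The likelihood-based checks needed for global convergence are omitted. The test problem is the classical multinomial linkage example with counts (125, 18, 20, 34), chosen only because its EM update has a closed form.

```python
import math


def em_update(theta):
    """EM update for the multinomial linkage example, counts (125, 18, 20, 34).

    Cell probabilities are (1/2 + t/4, (1 - t)/4, (1 - t)/4, t/4).
    """
    t = theta[0]
    # E-step: expected part of the first cell attributed to the t/4 component.
    x2 = 125.0 * (t / 4.0) / (0.5 + t / 4.0)
    # M-step: complete-data binomial proportion.
    return [(x2 + 34.0) / (x2 + 18.0 + 20.0 + 34.0)]


def squarem_step(update, theta):
    """One squared-extrapolation cycle (assumed steplength: -||r|| / ||v||)."""
    theta1 = update(theta)
    theta2 = update(theta1)
    r = [b - a for a, b in zip(theta, theta1)]                 # first difference
    v = [(c - b) - ri for b, c, ri in zip(theta1, theta2, r)]  # second difference
    norm_r = math.sqrt(sum(x * x for x in r))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return theta2  # nothing to extrapolate; fall back to two EM steps
    # Safeguard (assumption): alpha = -1 reproduces exactly two plain EM steps.
    alpha = min(-norm_r / norm_v, -1.0)
    extrapolated = [t - 2.0 * alpha * ri + alpha * alpha * vi
                    for t, ri, vi in zip(theta, r, v)]
    # A final EM step applied to the extrapolated point.
    return update(extrapolated)


def run(step, theta0, tol=1e-10, max_iter=1000):
    """Iterate `step` until successive iterates differ by less than `tol`."""
    theta = list(theta0)
    for i in range(1, max_iter + 1):
        new = step(theta)
        if max(abs(a - b) for a, b in zip(new, theta)) < tol:
            return new, i
        theta = new
    return theta, max_iter


theta_em, n_em = run(em_update, [0.5])
theta_sq, n_sq = run(lambda th: squarem_step(em_update, th), [0.5])
print("plain EM:", theta_em[0], "after", n_em, "updates")
print("squared :", theta_sq[0], "after", n_sq, "cycles (3 updates each)")
```

Both runs should reach the same fixed point. The sketch touches the model only through `em_update`, which is the sense in which an accelerator of this kind can be used off the shelf. This toy problem has a small fraction of missing information, so it is not a setting in which a large speed-up over EM should be expected.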
