Standard errors for EM estimation

The EM algorithm is a popular method for computing maximum likelihood estimates. One of its drawbacks is that it does not produce standard errors as a by-product. We consider obtaining standard errors by numerical differentiation. Two approaches are considered. The first differentiates the Fisher score vector to yield the Hessian of the log-likelihood. The second differentiates the EM operator and uses an identity that relates its derivative to the Hessian of the log-likelihood. The well-known SEM algorithm uses the second approach. We consider three additional algorithms: one that uses the first approach and two that use the second. We evaluate the complexity and precision of these three algorithms and of the SEM algorithm in seven examples. The first is a single-parameter example used to give insight. The others are three examples in each of two areas of EM application: Poisson mixture models and the estimation of covariance matrices from incomplete data. The examples show that there are algorithms that are much simpler and more accurate than the SEM algorithm. Hopefully their simplicity will increase the availability of standard error estimates in EM applications. It is shown that, as previously conjectured, a symmetry diagnostic can accurately estimate errors arising from numerical differentiation. Some issues related to the speed of the EM algorithm and of algorithms that differentiate the EM operator are identified.
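
To make the first approach concrete, the sketch below forward-differences a score (gradient of the log-likelihood) function to approximate the Hessian at the EM estimate, and takes standard errors as the square roots of the diagonal of the inverse observed information. It also reports the asymmetry of the numerical Hessian, in the spirit of the symmetry diagnostic mentioned above. This is a minimal illustration under assumed inputs, not the paper's implementation: `score`, `theta_hat`, and the step size `h` are user-supplied placeholders.

```python
import numpy as np

def se_from_score(score, theta_hat, h=1e-5):
    """Standard errors at the EM estimate theta_hat, obtained by
    forward-differencing a user-supplied score (gradient) function.

    score     : callable returning the score vector at a parameter vector
    theta_hat : maximum likelihood estimate (1-d array), e.g. from EM
    h         : forward-difference step size
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    p = theta_hat.size
    hess = np.empty((p, p))
    g0 = score(theta_hat)
    for j in range(p):
        e_j = np.zeros(p)
        e_j[j] = h
        # column j of the numerical Hessian of the log-likelihood
        hess[:, j] = (score(theta_hat + e_j) - g0) / h
    # The exact Hessian is symmetric, so the asymmetry of the numerical
    # Hessian gauges the error introduced by numerical differentiation.
    asymmetry = np.max(np.abs(hess - hess.T))
    hess = (hess + hess.T) / 2
    cov = np.linalg.inv(-hess)          # inverse observed information
    return np.sqrt(np.diag(cov)), asymmetry
```

A central difference (evaluating the score at theta_hat ± h e_j) roughly doubles the cost but reduces the differencing error, and the reported asymmetry indicates whether the chosen step size is adequate.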
