Iterative Likelihood: A Unified Inference Tool

Abstract. We propose a framework for inference based on an “iterative likelihood function,” which provides a unified representation for a number of iterative approaches, including the EM algorithm and the generalized estimating equations (GEEs). The parameters are decoupled to facilitate construction of the inference vehicle, to simplify computation, or to ensure robustness to model misspecification, and are then recoupled to retain their original interpretations. For simplicity, throughout the paper, we will refer to the log-likelihood as the “likelihood.” We define the global, local, and stationary estimates of an iterative likelihood and, correspondingly, the global, local, and stationary attraction points of the expected iterative likelihood. Asymptotic properties of the global, local, and stationary estimates are derived under certain assumptions. An iterative likelihood is usually constructed such that the true value of the parameter is a point of attraction of the expected log-likelihood. Often, one can only verify that the true value of the parameter is a local or stationary attraction, but not a global attraction. We show that when the true value of the parameter is a global attraction, any global estimate is consistent and asymptotically normal; when the true value is a local or stationary attraction, there exists, with probability tending to 1, a local or stationary estimate that is consistent and asymptotically normal. The behavior of the estimates under a misspecified model is also discussed. Our methodology is illustrated with three examples: (i) estimation of the treatment group difference in the level of censored HIV RNA viral load from an AIDS clinical trial; (ii) analysis of the relationship between forced expiratory volume and height in girls from a longitudinal pulmonary function study; and (iii) investigation of the impact of smoking on lung cancer in the presence of DNA adducts.
Two additional examples are in the supplementary materials: GEEs with missing covariates and an unweighted estimator for big data with subsampling. Supplementary files for this article are available online.
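
As a rough illustration of the decouple-then-recouple idea, and not the paper's own construction, the sketch below runs an EM-type iteration for a normal sample that is left-censored at a detection limit, loosely in the spirit of example (i). The current parameter value is held fixed while conditional expectations for the censored observations are computed (decoupling), a new value is obtained by maximization, and the two are then set equal and the step repeated until they agree (recoupling). The function names, the single-sample normal model, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def em_update(observed, n_censored, limit, mu, sigma):
    """One decoupled step (hypothetical helper).

    Expectations for censored values use the current (mu, sigma);
    the maximization then returns updated values.
    """
    a = (limit - mu) / sigma
    lam = norm_pdf(a) / norm_cdf(a)
    # First and second moments of Y given Y <= limit under N(mu, sigma^2).
    m1 = mu - sigma * lam
    m2 = mu * mu + sigma * sigma - sigma * (limit + mu) * lam
    n = len(observed) + n_censored
    s1 = sum(observed) + n_censored * m1
    s2 = sum(y * y for y in observed) + n_censored * m2
    new_mu = s1 / n
    new_sigma = math.sqrt(s2 / n - new_mu * new_mu)
    return new_mu, new_sigma


def fit_censored_normal(observed, n_censored, limit, tol=1e-10, max_iter=1000):
    """Iterate em_update until the old and new values agree (recoupling)."""
    # Naive starting values from the uncensored observations only.
    mu = sum(observed) / len(observed)
    sigma = math.sqrt(sum((y - mu) ** 2 for y in observed) / len(observed))
    for _ in range(max_iter):
        new_mu, new_sigma = em_update(observed, n_censored, limit, mu, sigma)
        if abs(new_mu - mu) + abs(new_sigma - sigma) < tol:
            return new_mu, new_sigma
        mu, sigma = new_mu, new_sigma
    return mu, sigma


def simulate(n, mu, sigma, limit, seed=0):
    """Simulated data (an assumption): values at or below `limit` are censored."""
    rng = random.Random(seed)
    draws = [rng.gauss(mu, sigma) for _ in range(n)]
    observed = [y for y in draws if y > limit]
    return observed, n - len(observed)
```

For instance, `fit_censored_normal(*simulate(5000, 2.0, 1.0, 1.5), 1.5)` returns a pair of estimates for the mean and standard deviation. The returned point is a stationary point of the iteration in the informal sense that one further call to `em_update` leaves it essentially unchanged.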
