Treatment of nonignorable missing data when modeling unobserved heterogeneity with finite mixture models

Multiple imputation has become a widely accepted technique for handling incomplete data. Typically, imputation of missing values and the statistical analysis are performed separately, so the imputation model has to be consistent with the analysis model. If the data are analyzed with a mixture model, the parameter estimates are usually obtained iteratively. Thus, if the data are missing not at random, parameter estimation and the treatment of missingness should be combined. We address both tasks by simultaneously imputing values using the data augmentation method and estimating parameters using the EM algorithm. This iterative procedure ensures that the missing values are properly imputed given the current parameter estimates. Properties of the parameter estimates were investigated in a simulation study. The method is illustrated using data from the National Health and Nutrition Examination Survey.
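
The alternation between imputation and estimation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes a univariate two-component normal mixture, and it draws imputations from the currently fitted mixture only. It contains no model for the missingness mechanism, which a treatment of nonignorable missingness would have to add. The function names `normal_pdf` and `impute_and_fit` and all starting values are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, vectorized over x, mean and sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def impute_and_fit(y, n_iter=200, seed=0):
    """Alternate a stochastic imputation step with one EM update.

    y is a 1-D array in which np.nan marks a missing value.
    Returns (weights, means, sds, completed_data).
    Assumption: imputations come from the current mixture fit only;
    no missingness model is included.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = y[~missing]
    n_missing = int(missing.sum())

    # Crude starting values taken from the observed data (an assumption).
    weights = np.array([0.5, 0.5])
    means = np.quantile(observed, [0.25, 0.75])
    sds = np.full(2, observed.std())
    completed = y.copy()

    for _ in range(n_iter):
        # Imputation step: draw a component label, then a value from it.
        comp = rng.choice(2, size=n_missing, p=weights)
        completed[missing] = rng.normal(means[comp], sds[comp])

        # E-step: posterior component probabilities for every data point.
        dens = weights * normal_pdf(completed[:, None], means, sds)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: weighted updates of the mixture parameters.
        nk = resp.sum(axis=0)
        weights = nk / nk.sum()
        means = (resp * completed[:, None]).sum(axis=0) / nk
        sq_dev = (completed[:, None] - means) ** 2
        sds = np.maximum(np.sqrt((resp * sq_dev).sum(axis=0) / nk), 1e-6)

    return weights, means, sds, completed
```

Because the imputed values are redrawn in every iteration, the estimates fluctuate around a stable value instead of converging exactly; averaging the estimates over the later iterations would be one way to summarize them.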
