Effects of Missing Data Methods in Structural Equation Modeling With Nonnormal Longitudinal Data

This study investigated the effects of missing data techniques in longitudinal studies under diverse conditions. A Monte Carlo simulation examined the performance of 3 missing data methods in latent growth modeling: listwise deletion (LD), maximum likelihood estimation using the expectation-maximization (EM) algorithm with a nonnormality correction (robust ML), and the pairwise asymptotically distribution-free method (pairwise ADF). The effects of 3 independent variables (sample size, missing data mechanism, and distribution shape) on convergence rate, parameter and standard error estimation, and model fit were examined. The results favored robust ML over LD and pairwise ADF in almost all respects. The exceptions were convergence rates under the most severe nonnormality in the missing not at random (MNAR) condition and recovery of standard error estimates across sample sizes. The results also indicate that nonnormality, small sample size, MNAR, and multicollinearity might adversely affect convergence rate and the validity of statistical inferences concerning parameter estimates and model fit statistics.
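To make the missing data mechanisms concrete, the following is a minimal Python sketch, not the study's actual simulation design. It generates normally distributed scores from a linear latent growth model, deletes last-wave scores under two mechanisms, and applies listwise deletion. Every numeric value (means, variances, 4 waves, a 30% missing rate), the function names, and the specific missingness rules are assumptions made for illustration. Missing completely at random (MCAR) is included only as a contrast to MNAR and may not match the conditions used in the study. Robust ML, pairwise ADF, and nonnormal data generation are not implemented here.

```python
import numpy as np


def simulate_growth(n, n_waves=4, seed=0):
    """Draw n cases from a linear latent growth model (illustrative values)."""
    rng = np.random.default_rng(seed)
    intercept = rng.normal(10.0, 2.0, size=n)   # latent intercepts
    slope = rng.normal(1.0, 0.5, size=n)        # latent slopes
    time = np.arange(n_waves)                   # time codes 0, 1, 2, ...
    noise = rng.normal(0.0, 1.0, size=(n, n_waves))
    return intercept[:, None] + slope[:, None] * time + noise


def impose_missing(y, mechanism, rate=0.3, seed=1):
    """Set last-wave scores to NaN under a hypothetical MCAR or MNAR rule."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    last = y[:, -1]
    if mechanism == "MCAR":
        # Missingness is unrelated to any observed or unobserved score.
        drop = rng.random(len(last)) < rate
    elif mechanism == "MNAR":
        # Missingness depends on the unobserved score itself:
        # the highest `rate` share of last-wave scores is lost.
        drop = last > np.quantile(last, 1.0 - rate)
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    y[drop, -1] = np.nan
    return y


def listwise_delete(y):
    """Keep only cases with complete data on every wave."""
    return y[~np.isnan(y).any(axis=1)]


if __name__ == "__main__":
    full = simulate_growth(5000)
    print("complete data, last-wave mean:", full[:, -1].mean())
    for mechanism in ("MCAR", "MNAR"):
        kept = listwise_delete(impose_missing(full, mechanism))
        print(mechanism, "cases kept:", len(kept),
              "last-wave mean:", kept[:, -1].mean())
```

Under these assumptions, listwise deletion leaves the last-wave mean roughly unchanged when data are MCAR but pulls it downward under this MNAR rule, because the discarded cases are exactly those with the highest scores.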
