5. Three Likelihood-Based Methods for Mean and Covariance Structure Analysis with Nonnormal Missing Data

Survey and longitudinal studies in the social and behavioral sciences generally contain missing data. Mean and covariance structure models play an important role in analyzing such data. Two promising methods for dealing with missing data are a direct maximum-likelihood approach and a two-stage approach based on the unstructured mean and covariance estimates obtained by the EM algorithm. Typical assumptions under these two methods are ignorable nonresponse and normality of data. However, data sets in the social and behavioral sciences are seldom normal, and experience with these procedures indicates that normal-theory-based methods applied to nonnormal data very often lead to incorrect model evaluations. By dropping the normal distribution assumption, we develop more accurate procedures for model inference. Based on the theory of generalized estimating equations, a way to obtain consistent standard errors of the two-stage estimates is given. The asymptotic efficiencies of different estimators are compared under various assumptions. We also propose a minimum chi-square approach and show that the estimator obtained by this approach is asymptotically at least as efficient as the two likelihood-based estimators for either normal or nonnormal data. The major contribution of this paper is that, for each estimator, we give a test statistic whose asymptotic distribution is chi-square as long as the underlying sampling distribution has finite fourth-order moments. We also give a characterization of each of the two likelihood ratio test statistics when the underlying distribution is nonnormal. Modifications to the likelihood ratio statistics are also given. Our working assumption is that the missing data mechanism is missing completely at random. Examples and Monte Carlo studies indicate that, for commonly encountered nonnormal distributions, the procedures developed in this paper are quite reliable even for samples whose data are missing at random.
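As a rough illustration of the first stage of the two-stage approach mentioned above, the following is a minimal sketch of the standard normal-theory EM iteration for the unstructured mean vector and covariance matrix when some entries are missing. It is not the authors' implementation: the function name `em_mean_cov`, the NaN coding of missing values, the starting values, and the stopping rule are all assumptions made for this example, and the sketch does not include the second-stage model fit or the corrected standard errors and test statistics developed in the paper.

```python
import numpy as np


def em_mean_cov(X, n_iter=500, tol=1e-8):
    """Normal-theory EM estimates of the mean and covariance from data
    with missing entries coded as NaN (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)

    # Starting values: available-case means and a diagonal covariance.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))

    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            cond_cov = np.zeros((p, p))
            if m.any() and o.any():
                # E-step: regress the missing part on the observed part.
                # coef equals Sigma_mo @ inv(Sigma_oo) because Sigma_oo is symmetric.
                coef = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
                x[m] = mu[m] + coef @ (x[o] - mu[o])
                cond_cov[np.ix_(m, m)] = sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
            elif m.any():
                # Fully missing case: it contributes only the current estimates.
                x[:] = mu
                cond_cov = sigma.copy()
            sum_x += x
            sum_xx += np.outer(x, x) + cond_cov

        # M-step: update the mean and the (ML, divisor n) covariance.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        delta = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if delta < tol:
            break
    return mu, sigma
```

In a two-stage analysis the returned estimates would then be passed to a complete-data structural model routine. With nonnormal data the point estimates can still be used, but the normal-theory standard errors and likelihood ratio statistics are the quantities the abstract says need correction.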
