Structural Equation Modeling With Many Variables: A Systematic Review of Issues and Developments

Survey data in the social, behavioral, and health sciences often contain many variables (p). Structural equation modeling (SEM) is commonly used to analyze such data. With a sufficient number of participants (N), SEM enables researchers to easily set up and reliably test hypothesized relationships among theoretical constructs as well as those between the constructs and their observed indicators. However, SEM analyses with small N or large p have been shown to be problematic. This article reviews issues and solutions for SEM with small N, especially when p is large. The topics addressed include methods for parameter estimation, test statistics for overall model evaluation, and reliable standard errors for evaluating the significance of parameter estimates. Previous recommendations on the required sample size N are also examined together with more recent developments. In particular, the N required by conventional methods can be far larger than expected, whereas recent developments can reduce the required N substantially. The issues and developments for SEM with many variables described in this article not only make applied researchers aware of cutting-edge methodology for SEM with big data, as characterized by a large p, but also highlight the challenges that methodologists need to address in further investigation.
