Measurement Invariance Testing with Many Groups: A Comparison of Five Approaches

With the increasing use of international survey data, especially in cross-cultural and multinational studies, establishing measurement invariance (MI) across a large number of groups is essential. Testing MI over many groups is methodologically challenging, however. We identified five methods for MI testing across many groups (multiple-group confirmatory factor analysis, multilevel confirmatory factor analysis, multilevel factor mixture modeling, Bayesian approximate MI testing, and alignment optimization) and explicated the similarities and differences among these approaches in terms of their conceptual models and statistical procedures. A Monte Carlo study was conducted to investigate the efficacy of the five methods in detecting measurement noninvariance across many groups using various fit criteria. In general, the five methods performed reasonably well in identifying the level of invariance when an appropriate fit criterion was used (e.g., the Bayesian information criterion with multilevel factor mixture modeling). Finally, general guidelines for selecting an appropriate method are provided.
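
As a minimal illustration of what selecting a level of invariance with an information criterion can look like, the sketch below computes the Bayesian information criterion, BIC = -2 log L + k ln(n), for three nested invariance models and keeps the one with the smallest value. The function name `bic`, the model labels, the log-likelihoods, the parameter counts, and the sample size are all hypothetical values chosen for illustration; they are not results from the study, and the appropriate sample size for the penalty term in multilevel models (individuals versus groups) is left as an assumption.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 * logL + k * ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fitted models (made-up log-likelihoods and parameter counts).
candidates = {
    "configural": {"log_likelihood": -10450.0, "n_params": 120},
    "metric": {"log_likelihood": -10480.0, "n_params": 90},
    "scalar": {"log_likelihood": -10600.0, "n_params": 60},
}
n_obs = 2000  # assumed total sample size used in the penalty term

# Compute BIC for each candidate and select the model with the smallest value.
scores = {
    name: bic(m["log_likelihood"], m["n_params"], n_obs)
    for name, m in candidates.items()
}
best = min(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name:>10}: BIC = {value:.2f}")
print("Selected model:", best)
```

With these made-up inputs the metric model has the lowest BIC: its loss of fit relative to the configural model is outweighed by the smaller penalty, whereas the additional constraints of the scalar model cost more fit than the penalty saves.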
