Measurement Invariance Conventions and Reporting: The State of the Art and Future Directions for Psychological Research.

Measurement invariance assesses the psychometric equivalence of a construct across groups or across time. Measurement noninvariance suggests that a construct has a different structure or meaning for different groups, or on different measurement occasions within the same group, so the construct cannot be meaningfully tested or construed across groups or across time. Hence, before testing mean differences across groups or measurement occasions (e.g., boys and girls, pretest and posttest), or differential relations of the construct across groups, it is essential to assess the invariance of the construct. Measurement invariance is tested and established in a sequence of steps, but conventions for testing and reporting it are still in flux, and researchers are often left with limited understanding and inconsistent advice. This report surveys the state of measurement invariance testing and reporting and details the results of a literature review of studies that tested invariance. Most tests of measurement invariance include configural, metric, and scalar steps; a residual invariance step is reported for fewer tests. Alternative fit indices (AFIs) are reported as model fit criteria for the vast majority of tests; χ² is reported as the sole index in a minority of invariance tests. Reporting AFIs is associated with higher levels of achieved invariance. Partial invariance is reported for about one-third of tests. In general, sample size, number of groups compared, and model size are unrelated to the level of invariance achieved. Implications for the future of measurement invariance testing, reporting, and best practices are discussed.
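The stepwise logic summarized above (configural, then metric, then scalar, then residual) can be illustrated with a minimal sketch. It assumes that each step has already been fitted elsewhere and that only its comparative fit index (CFI) is available; the 0.01 cutoff for the drop in CFI is a commonly used rule of thumb, not a value stated in this abstract, and the names `ModelFit`, `highest_invariance_level`, and `DELTA_CFI_CUTOFF` are hypothetical. The sketch does not judge whether the configural model itself fits adequately.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed rule of thumb (not taken from the abstract): a CFI drop larger
# than 0.01 between adjacent nested models is read as noninvariance.
DELTA_CFI_CUTOFF = 0.01

# Invariance steps in order of increasing restrictiveness.
STEPS = ["configural", "metric", "scalar", "residual"]


@dataclass
class ModelFit:
    """Fit summary for one invariance step (hypothetical container)."""
    step: str
    cfi: float


def highest_invariance_level(
    fits: List[ModelFit], cutoff: float = DELTA_CFI_CUTOFF
) -> Optional[str]:
    """Return the most restrictive step reached before CFI drops too far.

    The configural model is taken as the baseline; each later step is
    compared with the step immediately before it. Returns None if no
    configural model was supplied.
    """
    by_step = {fit.step: fit for fit in fits}
    achieved = None
    previous = None
    for step in STEPS:
        current = by_step.get(step)
        if current is None:
            break  # step was not tested, so stop the sequence here
        if previous is not None and (previous.cfi - current.cfi) > cutoff:
            break  # fit worsened beyond the cutoff: this step is rejected
        achieved = step
        previous = current
    return achieved


# Made-up CFI values for illustration only.
example_fits = [
    ModelFit("configural", 0.960),
    ModelFit("metric", 0.957),
    ModelFit("scalar", 0.940),
    ModelFit("residual", 0.935),
]
level = highest_invariance_level(example_fits)
```

With these made-up values the metric step is retained (CFI drops by 0.003) and the scalar step is rejected (CFI drops by 0.017), so the function reports metric invariance as the level achieved. A χ²-difference test could replace the CFI comparison in the same loop.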
