An Essay on Measurement and Factorial Invariance

Background: Analysis of subgroups such as different ethnic, language, or education groups selected from among a parent population is common in health disparities research. One goal of such analyses is to examine measurement equivalence, which includes both a qualitative review of the meaning of items and a quantitative examination of different levels of factorial invariance and differential item functioning.

Objectives: The purpose of this essay is to review the definitions and assumptions associated with factorial invariance, placing this formulation in the context of bias, fairness, and equity. The connection between the concepts of factorial invariance and item bias (differential item functioning) is discussed using a variant of item response theory, as are the situations under which the different forms of invariance (weak, strong, and strict) are required.

Methods: Establishing factorial invariance involves a hierarchy of levels, comprising tests of weak, strong, and strict invariance. Pattern (metric or weak) factorial invariance implies that the regression slopes are invariant across groups; it requires only invariant factor loadings. Strong factorial invariance implies that the conditional expectation of the response, given the common and specific factors, is invariant across groups; it requires that specific factor means (represented as invariant intercepts) also be identical across groups. Strict factorial invariance implies that, in addition, the conditional variance of the response, given the common and specific factors, is invariant across groups; it requires that, in addition to equal factor loadings and intercepts, the residual (specific factor plus error variable) variances be equivalent across groups. The concept of measurement invariance that is most closely aligned to that of item response theory considers the latent variable as a common factor measured by manifest variables; the specific factors can be characterized as nuisance variables.

Conclusions: Invariance of factor loadings across studied groups is required for valid comparisons of scale score or latent variable means. Strong and strict invariance may be less important in the context of basic research in which group differences in specific factors are indicative of individual differences that are important for scientific exploration. However, for most applications in which the aim is to ensure fairness and equity, strict factorial invariance is required. Health disparities research often focuses on self-reported clinical outcomes, such as quality of life, that are not observed directly. Latent variable models such as factor analyses are central to establishing valid assessment of such outcomes.
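
The hierarchy of weak, strong, and strict invariance described in the Methods can be written compactly in the usual multigroup common factor notation. The following is a sketch only: the symbols (x for the manifest variables, xi for the common factors, tau for intercepts, Lambda for loadings, Theta for the diagonal residual covariance matrix, kappa and Phi for the common factor mean and covariance, g for group) are assumed here for illustration and are not taken from the essay itself.

```latex
% Common factor model for the manifest variables in group g
% (notation assumed for illustration)
\begin{align*}
  x_g &= \tau_g + \Lambda_g \xi_g + \varepsilon_g,
  & E(\varepsilon_g) &= 0, \quad
  \operatorname{Cov}(\varepsilon_g) = \Theta_g \ \text{(diagonal)}, \quad
  \operatorname{Cov}(\xi_g, \varepsilon_g) = 0 .
\end{align*}

% Implied means and covariances of the manifest variables,
% with E(\xi_g) = \kappa_g and Cov(\xi_g) = \Phi_g
\begin{align*}
  \mu_g    &= \tau_g + \Lambda_g \kappa_g, \\
  \Sigma_g &= \Lambda_g \Phi_g \Lambda_g^{\top} + \Theta_g .
\end{align*}

% Nested invariance constraints across groups g = 1, \dots, G
\begin{align*}
  \text{weak (pattern, metric):} \quad
    & \Lambda_g = \Lambda, \\
  \text{strong:} \quad
    & \Lambda_g = \Lambda, \ \tau_g = \tau, \\
  \text{strict:} \quad
    & \Lambda_g = \Lambda, \ \tau_g = \tau, \ \Theta_g = \Theta .
\end{align*}

% Under strong invariance, a difference in manifest means between
% two groups reflects only a difference in common factor means
\begin{equation*}
  \mu_1 - \mu_2 = \Lambda (\kappa_1 - \kappa_2) .
\end{equation*}
```

In this notation, the common factor means and covariances (kappa and Phi) remain free to differ across groups at every level; only the measurement parameters are constrained. Under strict invariance, any remaining group difference in the means or covariances of the manifest variables is attributable to the common factors alone.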

[1]  B. Muthén Latent variable modeling in heterogeneous populations , 1989 .

[2]  G. Lubke,et al.  Can Unequal Residual Variances Across Groups Mask Differences in Residual Means in the Common Factor Model? , 2003 .

[3]  M. R. Novick,et al.  Statistical Theories of Mental Test Scores. , 1971 .

[4]  J. Teresi,et al.  Item and Scale Differential Functioning of the Mini-Mental State Exam Assessed Using the Differential Item and Test Functioning (DFIT) Framework , 2006, Medical care.

[5]  Louis Guttman,et al.  THE DETERMINACY OF FACTOR SCORE MATRICES WITH IMPLICATIONS FOR FIVE OTHER BASIC PROBLEMS OF COMMON‐FACTOR THEORY1 , 1955 .

[6]  B. Byrne,et al.  Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. , 1989 .

[7]  R. Zwick When Do Item Response Function and Mantel-Haenszel Definitions of Differential Item Functioning Coincide? , 1990 .

[8]  Richard N. Jones Identification of Measurement Differences Between English and Spanish Language Versions of the Mini-Mental State Examination: Detecting Differential Item Functioning Using MIMIC Modeling , 2006, Medical care.

[9]  J. Teresi,et al.  Identification of Differential Item Functioning Using Item Response Theory and the Likelihood-Based Model Comparison Approach: Application to the Mini-Mental State Examination , 2006, Medical care.

[10]  S. Gregorich Do Self-Report Instruments Allow Meaningful Comparisons Across Diverse Population Groups?: Testing Measurement Invariance Using the Confirmatory Factor Analysis Framework , 2006, Medical care.

[11]  Bengt Muthén,et al.  Simultaneous factor analysis of dichotomous variables in several groups , 1981 .

[12]  W. Meredith Measurement invariance, factor analysis and factorial invariance , 1993 .

[13]  B. Muthén,et al.  Applying Multigroup Confirmatory Factor Models for Continuous Outcomes to Likert Scale Data Complicates Meaningful Group Comparisons , 2004 .

[14]  Roderick P. McDonald,et al.  Factor Analysis and Related Methods , 1985 .

[15]  Linda M. Collins,et al.  New methods for the analysis of change , 2001 .

[16]  William Meredith,et al.  The role of factorial invariance in modeling growth and change. , 2001 .

[17]  W. Meredith,et al.  Inferential Conditions in the Statistical Detection of Measurement Bias , 1992 .

[18]  Edward Kulick,et al.  Differential Item Functioning on the Mini-Mental State Examination: An Application of the Mantel-Haenszel and Standardization Procedures , 2006, Medical care.

[19]  B. Muthén A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators , 1984 .

[20]  Gideon J. Mellenbergh,et al.  Item bias and item response theory , 1989 .

[21]  Gerald van Belle,et al.  Differential Item Functioning Analysis With Ordinal Logistic Regression Techniques: DIFdetect and difwithpar , 2006, Medical care.

[22]  Anders Christoffersson,et al.  Factor analysis of dichotomized variables , 1975 .

[23]  Dorothy T. Thayer,et al.  Differential Item Performance and the Mantel-Haenszel Procedure. , 1986 .

[24]  Bengt Muthén,et al.  Factor Structure in Groups Selected on Observed Scores , 1989 .

[25]  R E Millsap,et al.  Statistical Evidence in Salary Discrimination Studies: Nonparametric Inferential Conditions. , 1994, Multivariate behavioral research.