Reliability estimation in a multilevel confirmatory factor analysis framework.

Scales with varying degrees of measurement reliability are often used in the context of multistage sampling, where variance exists at multiple levels of analysis (e.g., individual and group). Because methodological guidance on assessing and reporting reliability at multiple levels of analysis is currently lacking, we discuss the importance of examining level-specific reliability. We present a simulation study and an applied example showing different methods for estimating multilevel reliability using multilevel confirmatory factor analysis and provide supporting Mplus program code. We conclude that (a) single-level estimates do not reflect a scale's actual reliability unless reliability is identical at each level of analysis, (b) 2-level alpha and composite reliability (omega) perform relatively well in most settings, (c) estimates of maximal reliability (H) are more biased than estimates of either alpha or omega when computed from multilevel data, and (d) small cluster sizes can lead to overestimates of reliability at the between level of analysis. We also show that Monte Carlo confidence intervals and Bayesian credible intervals closely reflect the sampling distribution of reliability estimates under most conditions. We discuss the estimation of credible intervals using Mplus and provide R code for computing Monte Carlo confidence intervals.
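
The level-specific coefficients and the Monte Carlo interval mentioned above can be illustrated with a minimal Python sketch. It is not the authors' Mplus or R code. It assumes a one-factor model at each level with the factor variance fixed at 1, and it assumes, purely for simplicity, that the parameter estimates are independent normal draws (a full implementation would sample from the estimated asymptotic covariance matrix of the parameters). All loadings, residual variances, and standard errors below are hypothetical values chosen for illustration.

```python
import random

def omega(loadings, residual_variances):
    """Composite reliability for one level (factor variance fixed at 1)."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(residual_variances))

def implied_cov(loadings, residual_variances):
    """Model-implied item covariance matrix: lambda * lambda' + diag(theta)."""
    k = len(loadings)
    return [[loadings[i] * loadings[j] + (residual_variances[i] if i == j else 0.0)
             for j in range(k)] for i in range(k)]

def alpha(cov):
    """Coefficient alpha from a k x k covariance matrix (list of lists)."""
    k = len(cov)
    total = sum(sum(row) for row in cov)
    diag = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1 - diag / total)

def monte_carlo_ci(loadings, residual_variances, std_errors,
                   draws=20000, level=0.95, seed=1):
    """Percentile interval for omega from simulated parameter draws.

    Assumption: estimates are treated as independent normals with the given
    standard errors (loadings first, then residual variances).
    """
    rng = random.Random(seed)
    k = len(loadings)
    estimates = list(loadings) + list(residual_variances)
    sims = []
    for _ in range(draws):
        sample = [rng.gauss(m, s) for m, s in zip(estimates, std_errors)]
        sims.append(omega(sample[:k], sample[k:]))
    sims.sort()
    lower = sims[int((1 - level) / 2 * draws)]
    upper = sims[int((1 + level) / 2 * draws) - 1]
    return lower, upper

# Hypothetical estimates for a four-item scale at each level.
within_lam, within_theta = [0.70, 0.80, 0.60, 0.75], [0.50, 0.40, 0.60, 0.45]
between_lam, between_theta = [0.40, 0.50, 0.30, 0.45], [0.05, 0.04, 0.08, 0.06]

omega_w = omega(within_lam, within_theta)
omega_b = omega(between_lam, between_theta)
alpha_w = alpha(implied_cov(within_lam, within_theta))
alpha_b = alpha(implied_cov(between_lam, between_theta))
ci_w = monte_carlo_ci(within_lam, within_theta, [0.05] * 8)

print(f"within:  omega={omega_w:.3f}, alpha={alpha_w:.3f}, "
      f"95% MC CI for omega=({ci_w[0]:.3f}, {ci_w[1]:.3f})")
print(f"between: omega={omega_b:.3f}, alpha={alpha_b:.3f}")
```

Computing the coefficients separately from the within-level and between-level parameters is what makes them level-specific. A single-level analysis of the pooled data would blend the two sets of parameters, so it would match neither value unless reliability were the same at both levels.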
