Characterizing Sources of Uncertainty in Item Response Theory Scale Scores

Traditional estimators of item response theory scale scores ignore uncertainty carried over from the item calibration process, which can lead to incorrect estimates of the standard errors of measurement (SEMs). Here, the authors review a variety of approaches that have been applied to this problem and compare them on the basis of their statistical methods and goals. They then elaborate on the particular flexibility and usefulness of a multiple imputation–based approach, which can be easily applied to tests with mixed item types and multiple underlying dimensions. This proposed method obtains corrected estimates of individual scale scores, as well as their SEMs. Furthermore, this approach enables a more complete characterization of the impact of parameter uncertainty by generating confidence envelopes (intervals) for item trace lines, test information functions, conditional SEM curves, and the marginal reliability coefficient. The multiple imputation–based approach is illustrated through the analysis of an artificial data set, then applied to data from a large educational assessment. A simulation study was also conducted to examine the relative contribution of item parameter uncertainty to the variability in score estimates under various conditions. The authors found that the impact of item parameter uncertainty is generally quite small, though there are some conditions under which the uncertainty carried over from item calibration contributes substantially to variability in the scores. This may be the case when the calibration sample is small relative to the number of item parameters to be estimated or when the item response theory model fit to the data is multidimensional.
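The multiple imputation idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a unidimensional two-parameter logistic (2PL) model, draws plausible item parameters from a normal approximation centered on the calibration estimates, scores the response pattern by expected a posteriori (EAP) estimation for each draw, and pools the results with Rubin's standard combining rules. The function names `eap_2pl` and `mi_score`, the quadrature grid, and the example values are all hypothetical.

```python
import numpy as np

def eap_2pl(responses, a, b, nodes, weights):
    """EAP score and posterior variance for one response pattern (2PL model)."""
    # Probability of a correct response at each node: shape (n_nodes, n_items)
    p = 1.0 / (1.0 + np.exp(-a * (nodes[:, None] - b)))
    # Likelihood of the observed pattern at each quadrature node
    like = np.prod(np.where(responses == 1, p, 1.0 - p), axis=1)
    post = like * weights
    post /= post.sum()
    mean = np.sum(nodes * post)
    var = np.sum((nodes - mean) ** 2 * post)
    return mean, var

def mi_score(responses, est, cov, n_imputations=200, seed=0):
    """Pool EAP scores over item parameter draws (assumed normal approximation).

    est holds the estimated slopes followed by the estimated difficulties;
    cov is their (assumed known) error covariance matrix from calibration.
    """
    rng = np.random.default_rng(seed)
    n_items = len(responses)
    # Fixed grid with standard normal prior weights (an illustrative choice)
    nodes = np.linspace(-4.0, 4.0, 81)
    weights = np.exp(-0.5 * nodes ** 2)
    weights /= weights.sum()

    draws = rng.multivariate_normal(est, cov, size=n_imputations)
    means = np.empty(n_imputations)
    variances = np.empty(n_imputations)
    for m, d in enumerate(draws):
        a, b = d[:n_items], d[n_items:]
        means[m], variances[m] = eap_2pl(responses, a, b, nodes, weights)

    score = means.mean()            # pooled scale score
    within = variances.mean()       # average measurement error variance
    between = means.var(ddof=1)     # variance due to item parameter uncertainty
    total = within + (1.0 + 1.0 / n_imputations) * between
    return score, np.sqrt(total), within, between

# Hypothetical five-item test with made-up parameter estimates
responses = np.array([1, 1, 0, 1, 0])
est = np.concatenate([np.full(5, 1.2), np.array([-1.0, -0.5, 0.0, 0.5, 1.0])])
cov = 0.01 * np.eye(10)
score, sem, within, between = mi_score(responses, est, cov)
print(f"score={score:.3f}  SEM={sem:.3f}  share from calibration={between / sem**2:.3%}")
```

The ratio of the between-imputation variance to the total variance gives a rough sense of how much of a score's uncertainty is carried over from item calibration, which is the quantity the simulation study in the abstract is concerned with.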
