A Comparison of Linking and Concurrent Calibration Under the Graded Response Model

Developing a common metric is essential to successful applications of item response theory to practical testing problems, such as equating, differential item functioning, and computerized adaptive testing. In this study, the authors compared two methods for developing a common metric under the graded response model: (a) linking separate calibration runs using equating coefficients from the characteristic curve method and (b) concurrent calibration using the combined data of the base and target groups. Concurrent calibration yielded consistently, albeit only slightly, smaller root mean square differences for both item discrimination and location parameters. Similar results were observed for distance measures between the item parameter estimates and the item parameters. For ability, concurrent calibration also yielded consistently, though only slightly, smaller root mean square differences than linking.
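The quantities named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below assumes the standard logistic form of the graded response model (without a scaling constant) and the usual linear metric transformation with slope A and intercept B. All function names and numeric values are hypothetical and chosen only for illustration. The sketch applies given equating coefficients; it does not estimate them by the characteristic curve method.

```python
import math

def boundary_prob(theta, a, b):
    """Probability of responding at or above a category boundary (logistic form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def grm_category_probs(theta, a, bs):
    """Category probabilities under the graded response model.

    bs is an ordered list of boundary locations; an item with m boundaries
    has m + 1 response categories.
    """
    cum = [1.0] + [boundary_prob(theta, a, b) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def to_base_metric(a, bs, A, B):
    """Place target-metric item parameters on the base metric.

    Assumes theta_base = A * theta_target + B, so a is divided by A and
    each location is mapped to A * b + B.
    """
    return a / A, [A * b + B for b in bs]

def rmsd(estimates, targets):
    """Root mean square difference between two equal-length sequences."""
    n = len(estimates)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, targets)) / n)

# Hypothetical item and equating coefficients (illustrative values only).
a, bs = 1.2, [-1.0, 0.0, 1.5]
A, B = 1.1, 0.3
theta_target = 0.5
theta_base = A * theta_target + B

a_base, bs_base = to_base_metric(a, bs, A, B)
probs_target = grm_category_probs(theta_target, a, bs)
probs_base = grm_category_probs(theta_base, a_base, bs_base)
```

Under these assumptions, `probs_target` and `probs_base` agree up to floating-point error, which is what makes a common metric possible: the transformation changes the scale but not the model's predictions. A recovery comparison like the one the abstract describes would then apply `rmsd` to the transformed estimates and the corresponding parameter values.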
