Paper information - Evaluation of Linking Methods for Placing Three-Parameter Logistic Item Parameter Estimates onto a One-Parameter Scale.

Different item response theory (IRT) models may be employed for item calibration. A change of testing vendors, for example, may result in the adoption of a different model from the one previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to the item parameter estimates obtained from the previous model. Because the assumptions of different models vary, it is necessary to identify the linking methods that best place item parameters estimated under the new model onto the scale of those estimated under the old model. In this study, we explore the results of linking three-parameter logistic (3PL) parameter estimates to one-parameter logistic (1PL) parameter estimates using Moment, Characteristic Curve, and Theta Regression methods. The data set consists of 31,813 student responses to a 78-item, multiple-choice End-of-Instruction exam. The evaluation criteria include the impact of the different linking methods on scale score means and standard deviations, scale score frequency distributions, Test Characteristic Curves and Standard Error Curves, test information, and the classification of students into the different proficiency levels. The Characteristic Curve linking methods (Stocking and Lord, and Haebara) best aligned the 3PL scale to the 1PL scale. The results suggest that if aligning the mean and standard deviation (SD) of the scale score distribution is considered most important, the Stocking and Lord method is preferable; if the classification of students into performance categories is considered most important, the Haebara method is recommended. In either case, the differences between the two methods are trivial.
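The linking methods named in the abstract all estimate a linear transformation, theta_old = A * theta_new + B, from items common to both calibrations. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes the textbook mean/sigma form of the moment method, the scaling constant D = 1.7, 1PL items represented as 3PL items with a = 1 and c = 0, and entirely hypothetical item parameters. The names `mean_sigma`, `rescale`, `haebara_loss`, and `stocking_lord_loss` are introduced here for illustration only.

```python
import math
from statistics import mean, pstdev

D = 1.7  # assumed logistic scaling constant; some calibrations use D = 1.0


def p3pl(theta, a, b, c):
    """3PL probability of a correct response; a=1, c=0 reduces to the 1PL form."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def mean_sigma(b_new, b_old):
    """Moment (mean/sigma) method: match mean and SD of common-item difficulties."""
    A = pstdev(b_old) / pstdev(b_new)
    B = mean(b_old) - A * mean(b_new)
    return A, B


def rescale(items, A, B):
    """Place (a, b, c) triples from the new scale onto the old scale."""
    return [(a / A, A * b + B, c) for a, b, c in items]


def haebara_loss(new_items, old_items, A, B, thetas):
    """Haebara criterion: squared item characteristic curve differences,
    summed over items and ability points."""
    scaled = rescale(new_items, A, B)
    return sum(
        (p3pl(t, *old) - p3pl(t, *new)) ** 2
        for t in thetas
        for old, new in zip(old_items, scaled)
    )


def stocking_lord_loss(new_items, old_items, A, B, thetas):
    """Stocking and Lord criterion: squared test characteristic curve
    differences, summed over ability points."""
    scaled = rescale(new_items, A, B)
    return sum(
        (sum(p3pl(t, *old) for old in old_items)
         - sum(p3pl(t, *new) for new in scaled)) ** 2
        for t in thetas
    )


# Hypothetical common items: the "new" estimates are an exact linear
# rescaling of the "old" ones with A = 2 and B = 0.5.
old_items = [(1.0, -1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.5, 0.0)]
new_items = [(2.0, -0.75, 0.0), (2.0, -0.25, 0.0), (2.0, 0.5, 0.0)]
thetas = [x / 2.0 for x in range(-8, 9)]  # ability grid from -4 to 4

A_hat, B_hat = mean_sigma([b for _, b, _ in new_items],
                          [b for _, b, _ in old_items])
h_loss = haebara_loss(new_items, old_items, A_hat, B_hat, thetas)
sl_loss = stocking_lord_loss(new_items, old_items, A_hat, B_hat, thetas)
```

In practice the characteristic curve methods choose A and B by numerically minimizing these criteria rather than by evaluating them at moment-based estimates. With real 3PL estimates, the nonzero guessing parameters mean that no linear transformation reproduces the 1PL curves exactly, which is why the choice of criterion can affect the resulting scale.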

Thakur B. Karkee | Karen R. Wright

[1] D. R. Divgi. Does the Rasch Model Really Work for Multiple Choice Items? Not If You Look Closely, 1986.

[2] E. Muraki. A Generalized Partial Credit Model: Application of an EM Algorithm, 1992.

[3] F. Baker, et al. A Comparison of Two Procedures for Computing IRT Equating Coefficients, 1991.

[4] Tomokazu Haebara, et al. Equating Logistic Ability Scales by a Weighted Least Squares Method, 1980.

[5] Martha L. Stocking, et al. Developing a Common Metric in Item Response Theory, 1982.

[6] Brenda H. Loyd, et al. Vertical Equating Using the Rasch Model, 1980.

[7] Georg Rasch, et al. Probabilistic Models for Some Intelligence and Attainment Tests, 1981, The SAGE Encyclopedia of Research Design.

[8] Gary L. Marco, et al. Item Characteristic Curve Solutions to Three Intractable Testing Problems, 1977.

[9] Wendy M. Yen, et al. Scaling Performance Assessments: Strategies for Managing Local Item Dependence, 1993.

[10] F. Lord. Applications of Item Response Theory to Practical Testing Problems, 1980.

[11] R. Brennan, et al. Test Equating: Methods and Practices, 1995.