Evaluation of Linking Methods for Placing Three-Parameter Logistic Item Parameter Estimates onto a One-Parameter Scale.

Different item response theory (IRT) models may be employed for item calibration. A change of testing vendors, for example, may result in the adoption of a model different from the one previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to those obtained from the previous model. Because the assumptions of different models vary, it is necessary to identify linking methods that best place item parameter estimates from the new model onto the scale established under the old model. In this study, we explore the results of linking 3PL parameter estimates to 1PL parameter estimates using Moment, Characteristic Curve, and Theta Regression methods. The data set consists of 31,813 student responses to a 78-item, multiple-choice End-of-Instruction exam. The evaluation criteria include the impact of different linking methods on scale score means and standard deviations, scale score frequency distributions, test characteristic curves and standard error curves, test information, and the classification of students into proficiency levels. The Characteristic Curve linking methods best aligned the 3PL scale with the 1PL scale. If aligning the mean and standard deviation of the scale score distribution is deemed most important, the Stocking and Lord method is preferable; if the classification of students into performance categories is deemed most important, the Haebara method is recommended. In either case, the differences between the two methods are trivial.
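
The characteristic curve methods referenced above estimate a linear transformation theta* = A*theta + B of the new scale so that characteristic curves computed from the transformed 3PL estimates match those on the reference scale: Stocking and Lord match the test characteristic curve (the sum of item characteristic curves), while Haebara matches the item characteristic curves item by item. The sketch below illustrates both criteria under stated assumptions; the simulated common-item parameter arrays, the quadrature grid, and the use of scipy.optimize.minimize are illustrative choices for exposition, not the study's implementation.

```python
import numpy as np
from scipy.optimize import minimize

D = 1.7  # logistic scaling constant

def icc_3pl(theta, a, b, c):
    """3PL item characteristic curves, shape (len(theta), n_items)."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta[:, None] - b)))

def transform(a, b, A, B):
    """Place new-scale estimates on the reference scale via theta* = A*theta + B."""
    return a / A, A * b + B

def stocking_lord(params, a, b, c, a_ref, b_ref, c_ref, theta):
    """Stocking-Lord criterion: squared difference between test characteristic curves."""
    A, B = params
    a_t, b_t = transform(a, b, A, B)
    diff = icc_3pl(theta, a_t, b_t, c).sum(axis=1) - icc_3pl(theta, a_ref, b_ref, c_ref).sum(axis=1)
    return np.sum(diff ** 2)

def haebara(params, a, b, c, a_ref, b_ref, c_ref, theta):
    """Haebara criterion: squared differences between item characteristic curves, summed over items."""
    A, B = params
    a_t, b_t = transform(a, b, A, B)
    diff = icc_3pl(theta, a_t, b_t, c) - icc_3pl(theta, a_ref, b_ref, c_ref)
    return np.sum(diff ** 2)

# Hypothetical common-item estimates: 3PL on the new scale; 1PL on the
# reference scale (a fixed at 1, c fixed at 0), as in the 3PL-to-1PL setting.
rng = np.random.default_rng(0)
n = 20
a, b, c = rng.uniform(0.5, 2.0, n), rng.normal(0.0, 1.0, n), rng.uniform(0.05, 0.25, n)
a_ref, b_ref, c_ref = np.ones(n), 0.9 * b + 0.2, np.zeros(n)
theta = np.linspace(-4.0, 4.0, 41)  # evaluation grid for the curves

res = minimize(stocking_lord, x0=[1.0, 0.0], args=(a, b, c, a_ref, b_ref, c_ref, theta))
A, B = res.x  # estimated slope and intercept of the linking transformation
```

By contrast, the moment methods set A and B directly from summary statistics of the common items, for example from the means and standard deviations of the difficulty estimates on the two scales, with no curve-matching optimization.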