The Effect of Model Misspecification on Classification Decisions Made Using a Computerized Test

Many computerized testing algorithms require fitting an item response theory (IRT) model to examinees' responses to facilitate item selection, the determination of test stopping rules, and classification decisions. Some IRT models are thought to be particularly useful for small-volume certification programs that wish to make the transition to computerized adaptive testing (CAT). The one-parameter logistic model (1-PLM) is usually assumed to require a smaller sample size for item parameter calibration than the three-parameter logistic model (3-PLM). This study examined the effects of model misspecification on the precision of decisions made using the sequential probability ratio test (SPRT). For this comparison, the 1-PLM was used to estimate item parameters even though the items' true characteristics followed a 3-PLM. Results demonstrated that the 1-PLM produced considerably more decision errors under simulation conditions resembling a real testing environment, compared to both the true model and a fixed-form standard reference set of items.

In certification and licensure testing, a balance must be maintained between minimizing costs for clients and protecting the public by making valid decisions about minimum competency. To remain competitive, a testing organization must be able to offer clients testing services at the lowest allowable price while still providing sound measurement. These services frequently include computerized testing.

Many forms of computerized testing are available today, ranging from the simple administration of a fixed form on a computer to computerized adaptive testing for estimating a broad range of examinee abilities. The primary focus of this paper is IRT model selection for the computerized classification test (CCT), an examination designed to make optimal pass/fail decisions.

When determining whether a credentialing or licensure client is ready to implement a CCT, many factors are taken into consideration. These include the current size and status of the client's item pool, the testing volume (i.e., the number of examinees who take each test form), the test administration frequency, the frequency of pretesting, the nature of the examinee population (e.g., first-time test takers versus recertifiers or advanced-level practitioners), and the current test blueprint or test content outline. From the client's point of view, it is desirable to know the predicted impact of moving from a fixed-length, fixed-form testing format to a variable-length test consisting of different items for different examinees. The outcomes or effects of a change in testing format could include changes in the average testing time or test length for an examinee, the average
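To make the comparison described in the abstract concrete, the following is a minimal sketch of an SPRT-based pass/fail decision under a correctly specified 3-PLM and a misspecified 1-PLM. It follows the standard Wald SPRT formulation: the log-likelihood ratio of the response string is accumulated at two ability points bracketing the cut score and compared against bounds derived from the nominal error rates. All item parameters, ability values, and error rates below are illustrative assumptions rather than values from the study, and for simplicity the sketch reuses the generating difficulty values as the 1-PLM difficulties instead of re-estimating them from response data.

```python
import math
import random

random.seed(1)

def p_3pl(theta, a, b, c):
    """3-PLM: discrimination a, difficulty b, pseudo-guessing c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_1pl(theta, b):
    """1-PLM: common discrimination of 1, no guessing parameter."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sprt(responses, p_fail, p_pass, alpha=0.05, beta=0.05):
    """Wald's SPRT over a scored response string.

    p_fail[i] / p_pass[i] are the model-implied probabilities of a
    correct answer to item i at the fail-point and pass-point abilities.
    Returns ('pass' | 'fail' | 'continue', number of items used).
    """
    upper = math.log((1.0 - beta) / alpha)  # accept H1: examinee passes
    lower = math.log(beta / (1.0 - alpha))  # accept H0: examinee fails
    llr = 0.0
    for n, (u, p0, p1) in enumerate(zip(responses, p_fail, p_pass), start=1):
        llr += (math.log(p1) - math.log(p0)) if u else \
               (math.log(1.0 - p1) - math.log(1.0 - p0))
        if llr >= upper:
            return "pass", n
        if llr <= lower:
            return "fail", n
    return "continue", len(responses)

# Hypothetical 3-PLM item bank (the true generating model).
items = [(random.uniform(0.6, 1.8),    # a: discrimination
          random.gauss(0.0, 1.0),      # b: difficulty
          random.uniform(0.1, 0.25))   # c: pseudo-guessing
         for _ in range(200)]

theta_true = 0.4                    # simulated examinee ability
theta_fail, theta_pass = -0.3, 0.3  # indifference region around the cut score

# Responses are generated from the true 3-PLM.
responses = [random.random() < p_3pl(theta_true, a, b, c) for a, b, c in items]

# Correctly specified SPRT: 3-PLM probabilities at both decision points.
dec_3pl = sprt(responses,
               [p_3pl(theta_fail, a, b, c) for a, b, c in items],
               [p_3pl(theta_pass, a, b, c) for a, b, c in items])

# Misspecified SPRT: 1-PLM probabilities using only the b parameters.
dec_1pl = sprt(responses,
               [p_1pl(theta_fail, b) for _, b, _ in items],
               [p_1pl(theta_pass, b) for _, b, _ in items])

print("3-PLM decision:", dec_3pl)
print("1-PLM decision:", dec_1pl)
```

Because the 1-PLM ignores guessing and item-to-item differences in discrimination, its likelihood ratios accumulate at a different rate than those of the generating model, which is one mechanism by which misspecification can produce the additional decision errors reported above.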