Properties of Ability Estimation Methods in Computerized Adaptive Testing

Simulations of computerized adaptive tests (CATs) were used to evaluate results yielded by four commonly used ability estimation methods: maximum likelihood estimation (MLE) and three Bayesian approaches, namely Owen's method, expected a posteriori (EAP), and maximum a posteriori (MAP). In line with the theoretical nature of the ability estimates and previous empirical research, the results showed clear distinctions between MLE and the Bayesian methods, with MLE yielding lower bias, higher standard errors, higher root mean square errors, lower fidelity, and lower administrative efficiency. Standard errors for MLE based on test information underestimated actual standard errors, whereas standard errors for the Bayesian methods based on posterior distribution standard deviations accurately estimated actual standard errors. Among the Bayesian methods, Owen's provided the worst overall results, and EAP provided the best. Using a variable starting rule in which examinees were initially classified into three broad ability groups greatly reduced the bias for the Bayesian methods, but had little effect on the results for MLE. On the basis of these results, guidelines are offered for selecting appropriate CAT ability estimation methods in different decision contexts.
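
To make the contrast among the estimators concrete, the following is a minimal sketch of how MLE, MAP, and EAP estimates and their standard errors can be computed for a single response pattern. It is not the simulation design used in the study. The two-parameter logistic item model, the standard normal prior, the grid from -4 to 4, the item parameters, and the function names are all assumptions made for illustration; Owen's method is omitted.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_ability(items, responses, lo=-4.0, hi=4.0, n=161):
    """Grid-based MLE, MAP, and EAP estimates for one response pattern.

    items: list of (a, b) pairs (discrimination, difficulty), hypothetical values.
    responses: list of 0/1 scores, one per item.
    """
    grid = [lo + i * (hi - lo) / (n - 1) for i in range(n)]

    # Log-likelihood of the response pattern at each grid point.
    loglik = []
    for t in grid:
        ll = 0.0
        for (a, b), u in zip(items, responses):
            p = p_correct(t, a, b)
            ll += math.log(p if u == 1 else 1.0 - p)
        loglik.append(ll)

    # Log-posterior under an assumed N(0, 1) prior (up to a constant).
    logpost = [ll - 0.5 * t * t for ll, t in zip(loglik, grid)]

    # MLE: grid point maximizing the likelihood. For all-correct or
    # all-incorrect patterns this lands on the grid boundary.
    mle = grid[max(range(n), key=loglik.__getitem__)]
    # Information-based standard error evaluated at the MLE.
    info = sum(a * a * p_correct(mle, a, b) * (1.0 - p_correct(mle, a, b))
               for a, b in items)
    mle_se = 1.0 / math.sqrt(info)

    # MAP: grid point maximizing the posterior.
    map_est = grid[max(range(n), key=logpost.__getitem__)]

    # EAP: posterior mean; its standard error is the posterior SD.
    peak = max(logpost)
    weights = [math.exp(lp - peak) for lp in logpost]
    total = sum(weights)
    eap = sum(t * w for t, w in zip(grid, weights)) / total
    post_var = sum((t - eap) ** 2 * w for t, w in zip(grid, weights)) / total

    return {"mle": mle, "mle_se": mle_se, "map": map_est,
            "eap": eap, "posterior_sd": math.sqrt(post_var)}

# Hypothetical four-item test with a mixed response pattern.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0), (1.5, 0.5)]
responses = [1, 1, 0, 1]
result = estimate_ability(items, responses)
print(result)
```

With a short test such as this one, the prior pulls the MAP and EAP estimates toward its mean of zero, which illustrates the kind of bias the abstract attributes to the Bayesian methods, while the MLE is not shrunk and carries a larger standard error.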
