Deriving Stopping Rules for Multidimensional Computerized Adaptive Testing

Multidimensional computerized adaptive testing (MCAT) provides a vector of ability estimates for each examinee, which can be used to build a more informative profile of the examinee’s performance. The current literature on MCAT focuses on fixed-length tests, which can produce less accurate results for examinees whose abilities differ substantially from the average difficulty level of the item bank, particularly when the bank contains only a limited number of items. Therefore, instead of stopping the test at a predetermined fixed length, the authors use a more informative stopping criterion that is directly related to measurement accuracy. Specifically, this research derives four stopping rules that either quantify the measurement precision of the ability vector (i.e., minimum determinant rule [D-rule], minimum eigenvalue rule [E-rule], and maximum trace rule [T-rule]) or quantify the amount of available information carried by each item (i.e., maximum Kullback–Leibler divergence rule [K-rule]). The simulation results showed that all four stopping rules successfully terminated the test once the mean squared error of ability estimation was within a desired range, regardless of examinees’ true abilities. With the D-, E-, or T-rule, examinees with extreme abilities tended to receive tests that were twice as long as those received by examinees with moderate abilities. With the K-rule, however, the difference in test length was not as dramatic, indicating that the K-rule may not be very sensitive to measurement precision. In all cases, the cutoff value for each stopping rule needs to be adjusted on a case-by-case basis to find an optimal solution.
