Variable-Length Stopping Rules for Multidimensional Computerized Adaptive Testing

In computerized adaptive testing (CAT), a variable-length stopping rule ends item administration once a pre-specified measurement precision standard has been satisfied. The goal is to provide equal measurement precision for all examinees regardless of their true latent trait level. Several stopping rules have been proposed for unidimensional CAT, such as the minimum information rule and the maximum standard error rule. These rules have also been extended to multidimensional CAT and cognitive diagnostic CAT, and they all share the same idea of monitoring measurement error. Recently, Babcock and Weiss (J Comput Adapt Test 2012. https://doi.org/10.7333/1212-0101001) proposed an “absolute change in theta” (CT) rule, which is useful when an item bank has been exhausted of good items for one or more ranges of the trait continuum. Choi, Grady and Dodd (Educ Psychol Meas 70:1–17, 2010) also argued that a CAT should stop when the standard error no longer changes, which implies that the item bank is likely exhausted. Although these stopping rules have been evaluated and compared in different simulation studies, the relationships among them remain unclear, and clear guidelines on when to use which rule are therefore lacking. This paper presents analytic results that show the connections among various stopping rules in both unidimensional and multidimensional CAT. In particular, it is argued that the CT-rule alone can be unstable and can end the test prematurely. However, the CT-rule can be a useful secondary rule for monitoring the point of diminishing returns. To provide further empirical evidence, three simulation studies are reported using both the 2PL model and the multidimensional graded response model.
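The idea of a maximum standard error rule backed by a secondary CT check can be sketched as follows for the unidimensional case. This is a minimal illustration under stated assumptions, not the procedure evaluated in the paper: the function name `should_stop`, the thresholds `se_max` and `ct_min`, and the `min_items`/`max_items` bounds are all hypothetical choices made for the example.

```python
from typing import Sequence, Tuple


def should_stop(
    theta_history: Sequence[float],
    se_history: Sequence[float],
    se_max: float = 0.3,    # assumed precision target (hypothetical value)
    ct_min: float = 0.02,   # assumed "change in theta" threshold (hypothetical value)
    min_items: int = 5,     # assumed guard against premature stopping
    max_items: int = 50,    # assumed hard cap on test length
) -> Tuple[bool, str]:
    """Decide whether to end the test after the most recent item.

    theta_history[i] and se_history[i] are the trait estimate and its
    standard error after item i + 1 has been administered.
    Returns (stop, reason), where reason is one of
    "max_items", "se", "ct", or "continue".
    """
    n = len(theta_history)
    if n != len(se_history):
        raise ValueError("theta_history and se_history must have equal length")

    # Hard cap: stop no matter what once the maximum length is reached.
    if n >= max_items:
        return True, "max_items"

    # Do not let any precision-based rule fire on a very short test.
    if n < min_items:
        return False, "continue"

    # Primary rule: the standard error has reached the precision target.
    if se_history[-1] <= se_max:
        return True, "se"

    # Secondary rule: the estimate has barely moved since the previous
    # item, suggesting further items bring little additional gain.
    if n >= 2 and abs(theta_history[-1] - theta_history[-2]) < ct_min:
        return True, "ct"

    return False, "continue"
```

In this sketch the CT check is only consulted after the standard error rule has failed and a minimum number of items has been given, which reflects the abstract's point that the CT-rule is better suited as a secondary rule than as the sole criterion.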
