Estimating Measurement Precision in Reduced-Length Multistage-Adaptive Testing

This study evaluated the extent to which reducing the number of items in a multi-stage adaptive test (MST) affects measurement precision. Using the Massachusetts Adult Proficiency Test for Reading (MAPT), a low-stakes MST in adult education, estimates of reliability, decision consistency, and decision accuracy were compared for the original 40-item test and a reduced-length 35-item test. Four approaches were used: (1) applying the Spearman-Brown formula, (2) eliminating one item of average discrimination from consecutive stages, (3) completely reassembling new panels, and (4) simulating item responses to the original and shortened MSTs and comparing the standard errors of measurement for simulated examinees. Overall, the results suggested that comparable levels of measurement precision, improved content representation, and reduced testing time were achievable with the reduced-length tests. The Spearman-Brown estimates were surprisingly close to the estimates based on assembling new panels. Methods for assembling an MST to maintain measurement precision are discussed, along with practical lessons that could generalize to MSTs in other contexts.

Keywords: multi-stage adaptive testing, reliability, decision consistency, decision accuracy, response time, test development, validity.
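
As an illustration of approach (1), the Spearman-Brown formula projects the reliability of a test whose length is changed by a factor k as k·ρ / (1 + (k − 1)·ρ), where ρ is the reliability of the original test. The sketch below applies it to the 40-to-35 item reduction (k = 35/40). The function name `spearman_brown` and the starting reliability of 0.90 are assumptions chosen for illustration; they are not values or code from the study.

```python
def spearman_brown(reliability: float, length_ratio: float) -> float:
    """Project reliability after the test length is multiplied by length_ratio.

    Standard Spearman-Brown prophecy formula:
        rho_new = k * rho / (1 + (k - 1) * rho)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if length_ratio <= 0.0:
        raise ValueError("length_ratio must be positive")
    return length_ratio * reliability / (1.0 + (length_ratio - 1.0) * reliability)


# Hypothetical original reliability, for illustration only (not a study result).
original_reliability = 0.90

# Length ratio for shortening the test from 40 to 35 items.
k = 35 / 40

projected = spearman_brown(original_reliability, k)
print(f"Projected reliability for the 35-item test: {projected:.3f}")
```

The formula assumes the removed items are parallel to those retained, which adaptive routing in an MST does not guarantee. That is one reason the projection is compared against the panel-based and simulation-based approaches rather than used alone.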