A comparison of item selection rules including precision, content, and exposure considerations at the early stages of computerized adaptive testing

Item selection at the early stages of a computerized adaptive test (CAT) is known to be less appropriate than at the later stages. Several new item selection rules have been proposed to improve early-stage item selection. Previous research suggests that these new rules tend to improve item selection at the early stages of a CAT, especially at extreme negative trait levels, and that they perform as well as the Fisher information rule at the later stages. These apparent benefits, however, were observed in an idealized CAT setting in which an information function was the only criterion used in item selection. In more realistic CAT situations, where content balancing and item exposure control are item selection criteria in addition to an information function, the relationships among the item selection rules may be different. The purpose of this study was to compare four item selection rules with respect to the precision of trait estimation and the extent of item usage at the early stages of a CAT: Fisher information (F), Fisher information with a posterior distribution (FP), Kullback-Leibler information with a posterior distribution (KP), and completely randomized item selection (RN). The four rules were compared under three conditions: (a) using only the item information function as the item selection criterion; (b) using both the item information function and content balancing; and (c) using the item information function, content balancing, and item exposure control. When test length was less than 10 items, FP and KP tended to outperform F when item information was the only item selection criterion. However, in realistic CAT settings that included content balancing and item exposure control as item selection criteria, FP and KP did not significantly outperform the traditional Fisher information rule. When test length was greater than 10 items, F, FP, and KP performed similarly regardless of the item selection criteria, and F yielded slightly higher item usage than FP or KP.
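As a rough illustration of how the four rules differ, the following Python sketch scores candidate items under each rule. It is not the study's implementation. The abstract does not specify an item response model, a prior, or an integration method, so the two-parameter logistic model, the standard normal prior, the fixed quadrature grid, and every function and variable name here are assumptions made only for this example. The KP criterion follows one common formulation, in which the Kullback-Leibler information between the current estimate and each candidate trait value is weighted by the posterior.

```python
import math
import random

# Assumed quadrature grid over the trait scale: -4.0 to 4.0 in steps of 0.1.
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def kl_info(theta, theta_hat, a, b):
    """Kullback-Leibler information between theta_hat and theta for one item."""
    p0 = p_correct(theta_hat, a, b)
    p1 = p_correct(theta, a, b)
    return p0 * math.log(p0 / p1) + (1.0 - p0) * math.log((1.0 - p0) / (1.0 - p1))


def posterior(responses):
    """Normalized posterior weights on GRID.

    responses is a list of ((a, b), u) pairs with u in {0, 1}.
    A standard normal prior is assumed.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)
        for (a, b), u in responses:
            p = p_correct(theta, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def select_item(rule, pool, used, theta_hat, post, rng=random):
    """Return the index of the next item under rule F, FP, KP, or RN."""
    candidates = [j for j in range(len(pool)) if j not in used]
    if rule == "RN":
        # Completely randomized selection ignores all information.
        return rng.choice(candidates)

    def score(j):
        a, b = pool[j]
        if rule == "F":
            # Information at the current point estimate only.
            return fisher_info(theta_hat, a, b)
        if rule == "FP":
            # Fisher information averaged over the posterior.
            return sum(w * fisher_info(t, a, b) for t, w in zip(GRID, post))
        if rule == "KP":
            # KL information averaged over the posterior.
            return sum(w * kl_info(t, theta_hat, a, b) for t, w in zip(GRID, post))
        raise ValueError("unknown rule: " + rule)

    return max(candidates, key=score)


# Hypothetical three-item pool of (discrimination, difficulty) pairs.
pool = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
prior_only = posterior([])
for rule in ("F", "FP", "KP", "RN"):
    print(rule, select_item(rule, pool, set(), 0.0, prior_only))
```

The sketch covers only condition (a), where item information is the sole criterion. Content balancing and item exposure control, as in conditions (b) and (c), would further restrict or reweight the candidate list before scoring and are not modeled here.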