Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items

Item scores that do not fit an assumed item response theory model may cause the latent trait value to be estimated inaccurately. For computerized adaptive tests (CATs) using dichotomous items, several person-fit statistics for detecting misfitting item score patterns have been proposed. For both paper-and-pencil (P&P) tests and CATs, however, detection of person misfit with polytomous items has hardly been explored. In this study, the nominal and empirical null distributions of the standardized log-likelihood statistic for polytomous items were first compared for both P&P tests and CATs. Results showed that the empirical distribution of this statistic differed from the assumed standard normal distribution for both P&P tests and CATs. Second, a new person-fit statistic based on the cumulative sum (CUSUM) procedure from statistical process control was proposed. By means of simulated data, critical values were determined that can be used to classify a pattern as fitting or misfitting. The effectiveness of the CUSUM statistic in detecting simulees with item preknowledge was investigated. Detection rates using the CUSUM were high for realistic numbers of disclosed items.
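
The general CUSUM idea from statistical process control can be illustrated with a minimal sketch. It is not the statistic as defined in the study: the abstract does not give the residual definition, so the sketch assumes each administered item yields a residual (for example, observed minus model-expected item score), and the function names and the bounds passed to `is_flagged` are hypothetical. In the study, critical values were obtained from simulated data.

```python
from typing import List, Sequence, Tuple


def cusum_path(residuals: Sequence[float]) -> List[Tuple[float, float]]:
    """Return the running (upper, lower) two-sided CUSUM after each item.

    `residuals` is assumed to hold one value per administered item,
    e.g. observed minus expected item score (an assumption of this sketch).
    """
    upper, lower = 0.0, 0.0
    path = []
    for t in residuals:
        # The upper sum accumulates positive drift and resets at zero.
        upper = max(0.0, upper + t)
        # The lower sum accumulates negative drift and resets at zero.
        lower = min(0.0, lower + t)
        path.append((upper, lower))
    return path


def is_flagged(residuals: Sequence[float],
               upper_bound: float,
               lower_bound: float) -> bool:
    """Flag a pattern if either sum ever crosses its (hypothetical) bound."""
    return any(u > upper_bound or l < lower_bound
               for u, l in cusum_path(residuals))
```

A run of positive residuals, as might occur when a simulee has preknowledge of several consecutive items, pushes the upper sum past its bound, whereas isolated deviations are absorbed by the reset at zero.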