The Null Distribution of Person-Fit Statistics for Conventional and Adaptive Tests

Several person-fit statistics have been proposed to detect item score patterns that do not fit an item response theory model. To classify response patterns as misfitting, the distribution of a person-fit statistic is needed. The theoretical null distributions of several fit statistics have been derived for paper-and-pencil (P&P) tests. However, it is unknown whether these distributions also hold for computerized adaptive tests (CAT). A three-part simulation study was conducted. In the first study, the theoretical distribution of the l_z statistic across trait (θ) levels for CAT and P&P tests was investigated. The distribution of the l*_z statistic proposed by Snijders (in press) was also investigated. Results indicated that the distributions of both l_z and l*_z differed from the theoretical distribution in CAT. The second study examined the distributions of l_z and l*_z using simulation. These simulated distributions, when based on the estimated θ, were found to be problematic in CAT. In the third study, the detection rates of l*_z and l_z were compared. The rates for both statistics were found to be similar in most cases.
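As a rough illustration of what the l_z statistic measures, the following minimal sketch computes the standardized log-likelihood of a dichotomous response pattern at a given θ. It is not taken from the paper: the two-parameter logistic model, the item parameters in `ITEMS`, the value of `THETA`, and the function names `p_2pl` and `lz` are all assumptions chosen for the example, and the l*_z correction by Snijders is not implemented.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lz(responses, theta, items):
    """Standardized log-likelihood person-fit statistic l_z.

    responses: list of 0/1 item scores
    theta:     trait level at which the pattern is evaluated
    items:     list of (a, b) pairs, hypothetical 2PL item parameters
    """
    l0 = 0.0        # observed log-likelihood of the pattern
    expected = 0.0  # expectation of l0 under the model
    variance = 0.0  # variance of l0 under the model
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    # The variance is zero only if every item has p = 0.5 at this theta.
    return (l0 - expected) / math.sqrt(variance)


# Hypothetical five-item test, ordered from easy to hard.
ITEMS = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
THETA = 0.0

# A pattern consistent with the model: easy items right, hard items wrong.
lz_fit = lz([1, 1, 1, 0, 0], THETA, ITEMS)
# A reversed pattern: easy items wrong, hard items right.
lz_misfit = lz([0, 0, 1, 1, 1], THETA, ITEMS)

print(f"l_z (consistent pattern): {lz_fit:.3f}")
print(f"l_z (reversed pattern):   {lz_misfit:.3f}")
```

Large negative values point to a pattern that is unlikely under the model. Classifying a pattern as misfitting by comparing l_z with a standard normal cutoff relies on the theoretical null distribution, which is the assumption the abstract reports as not holding in CAT.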

Rob R. Meijer | Edith M. L. A. van Krimpen-Stoop

[1] M. Liou, et al. Constructing the exact significance level for a person fit statistic, 1992.

[2] R. Hambleton, et al. Item Response Theory, 1984, The History of Educational Measurement.

[3] Rob R. Meijer, et al. Detecting person misfit in adaptive testing using statistical process control techniques, 2000.

[4] S. Reise, et al. Fitting the Two-Parameter Model to Personality Data, 1990.

[5] Steven P. Reise, et al. Traitedness and the assessment of response pattern scalability, 1993.

[6] T. A. Warm. Weighted likelihood estimation of ability in item response theory, 1989.

[7] David B. Dunson, et al. Bayesian Data Analysis, 2010.

[8] Fritz Drasgow, et al. Appropriateness Measurement for Some Multidimensional Test Batteries, 1991.

[9] K. C. Klauer. An exact and optimal standardized person test for assessing consistency with the Rasch model, 1991.

[10] Fritz Drasgow, et al. Appropriateness measurement with polychotomous item response models and standardized indices, 1984.

[11] R. Tibshirani, et al. An introduction to the bootstrap, 1993.

[12] Fritz Drasgow, et al. Optimal Detection of Certain Forms of Inappropriate Test Scores, 1986.

[13] I. W. Molenaar, et al. Rasch models: foundations, recent developments and applications, 1995.

[14] Klaas Sijtsma, et al. Influence of Test and Person Characteristics on Nonparametric Appropriateness Measurement, 1994.

[15] Fritz Drasgow, et al. Optimal Identification of Mismeasured Individuals, 1996.

[16] Karl Christoph Klauer. The Assessment of Person Fit, 1995.

[17] B. Efron. The jackknife, the bootstrap, and other resampling plans, 1987.

[18] R. Hambleton, et al. Item Response Theory: Principles and Applications, 1984.

[19] Michael L. Nering. The Distribution of Indexes of Person Fit within the Computerized Adaptive Testing Environment, 1997.

[20] Fritz Drasgow, et al. Detecting Faking on a Personality Instrument Using Appropriateness Measurement, 1996.

[21] Edward J. Bedrick. Approximating the conditional distribution of person fit indexes for checking the Rasch model, 1997.

[22] D. Rubin, et al. Measuring the Appropriateness of Multiple-Choice Test Scores, 1976.

[23] Rob R. Meijer, et al. The Number of Guttman Errors as a Simple and Powerful Person-Fit Statistic, 1994.

[24] Rob R. Meijer, et al. Trait Level Estimation for Nonfitting Response Vectors, 1997.

[25] Rob R. Meijer, et al. CUSUM-Based Person-Fit Statistics for Adaptive Testing, 2001.

[26] Steven P. Reise, et al. Scoring Method and the Detection of Person Misfit in a Personality Assessment Context, 1995.

[27] Herbert Hoijtink, et al. The many null distributions of person fit indices, 1990.

[28] Kikumi K. Tatsuoka, et al. Caution indices based on item response theory, 1984.

[29] Fritz Drasgow, et al. Detecting Inappropriate Test Scores with Optimal and Practical Appropriateness Indices, 1987.

[30] F. Baker, et al. Item response theory: parameter estimation techniques, 1993.

[31] Donald B. Rubin, et al. Measuring the Appropriateness of Multiple-Choice Test Scores, 1979.

[32] Steven P. Reise, et al. The Influence of Test Characteristics on the Detection of Aberrant Response Patterns, 1991.

[33] Rob R. Meijer, et al. Statistical Tests for Person Misfit in Computerized Adaptive Testing. Research Report 98-01, 1998.

[34] Cornelis A.W. Glas, et al. Computerized adaptive testing: theory and practice, 2000.

[35] R. J. De Ayala. The nominal response model in computerized adaptive testing, 1992.

[36] Robert J. Jannarone, et al. Conjunctive item response theory kernels, 1986.