Paper Information - The Effect of Test Length and IRT Model on the Distribution and Stability of Three Appropriateness Indexes

The extent to which three appropriateness indexes, Z3, ECIZ4, and W (a variation of Wright's person-fit statistic), are well standardized was investigated in a Monte Carlo study. To assess the effects of the item response theory (IRT) model and test length on the distribution of the indexes and on their cutoff values at three false positive rates, nonaberrant response patterns were generated. ECIZ4 most closely approximated a normal distribution, showing less skewness and kurtosis than Z3 and W. The ECIZ4 cutoff values were also less affected by test length and IRT model than those of Z3 and W. In contrast, the distribution of W was the least stable over replications, and its cutoff values varied greatly depending on the IRT model and test length.
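
The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a standardized log-likelihood person-fit statistic (often written l_z) as a stand-in for the indexes studied, evaluates it at the true ability rather than an estimate, and uses hypothetical item parameters, test lengths (20, 40, 80), and false positive rates (0.01, 0.05, 0.10), none of which are given in the abstract. The function names `simulate_lz` and `lower_cutoff` are invented for illustration.

```python
import math
import random


def simulate_lz(n_items, n_persons, discriminations=None, seed=0):
    """Generate nonaberrant responses under a logistic IRT model and
    return one standardized log-likelihood value per simulated person."""
    rng = random.Random(seed)
    # Hypothetical item difficulties, evenly spaced over [-2, 2].
    b = [-2 + 4 * j / (n_items - 1) for j in range(n_items)]
    # All discriminations equal to 1.0 gives a Rasch-type model (assumption).
    a = discriminations or [1.0] * n_items
    values = []
    for _ in range(n_persons):
        theta = rng.gauss(0, 1)  # true ability, used directly (assumption)
        loglik = expected = variance = 0.0
        for aj, bj in zip(a, b):
            p = 1 / (1 + math.exp(-aj * (theta - bj)))
            u = 1 if rng.random() < p else 0  # model-consistent response
            loglik += u * math.log(p) + (1 - u) * math.log(1 - p)
            expected += p * math.log(p) + (1 - p) * math.log(1 - p)
            variance += p * (1 - p) * math.log(p / (1 - p)) ** 2
        values.append((loglik - expected) / math.sqrt(variance))
    return values


def lower_cutoff(values, rate):
    """Empirical lower-tail cutoff: about `rate` of the simulated
    nonaberrant values fall at or below the returned value."""
    ordered = sorted(values)
    k = max(0, math.floor(rate * len(ordered)) - 1)
    return ordered[k]


if __name__ == "__main__":
    for n_items in (20, 40, 80):  # hypothetical test lengths
        lz = simulate_lz(n_items, n_persons=2000, seed=n_items)
        cuts = [round(lower_cutoff(lz, r), 2) for r in (0.01, 0.05, 0.10)]
        print(n_items, cuts)
```

Comparing the printed cutoffs across test lengths, or across different `discriminations` lists, gives a rough sense of how stable an index's cutoff values are. A well-standardized index would show cutoffs that change little across these conditions.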
