Paper information - Detecting Inappropriate Test Scores with Optimal and Practical Appropriateness Indices

Several statistics have been proposed as quantitative indices of the appropriateness of a test score as a measure of ability. Two criteria have been used to evaluate such indices in previous research. The first criterion, standardization, refers to the extent to which the conditional distributions of an index, given ability, are invariant across ability levels. The second criterion, relative power, refers to indices' relative effectiveness for detecting inappropriate test scores. In this paper the effectiveness of nine appropriateness indices is determined in an absolute sense by comparing them to optimal indices; an optimal index is the most powerful index for a particular form of aberrance that can be computed from item responses. Three indices were found to provide nearly optimal rates of detection of very low ability response patterns modified to simulate cheating, as well as very high ability response patterns modified to simulate spuriously low responding. Optimal indices had detection rates from 50% to 200% higher than any other index when average ability response vectors were manipulated to appear spuriously high and spuriously low.
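To make the idea of a standardized appropriateness index concrete, the following is a minimal sketch of one widely used index from this literature, the standardized log-likelihood statistic usually written l_z. The abstract does not name the nine indices studied, so the choice of l_z, the three-parameter logistic item model, the scaling constant 1.7, and the function names `p_3pl` and `l_z` are all assumptions made for illustration rather than the authors' implementation.

```python
import math

def p_3pl(theta, a, b, c, d=1.7):
    """Probability of a correct response under a three-parameter
    logistic model (assumed here for illustration)."""
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))

def l_z(responses, theta, items):
    """Standardized log-likelihood of a 0/1 response pattern.

    responses: list of 0/1 item scores
    theta:     ability value at which the pattern is evaluated
    items:     list of (a, b, c) item parameter tuples
    """
    log_lik = 0.0   # observed log-likelihood of the pattern
    expected = 0.0  # its conditional expectation given theta
    variance = 0.0  # its conditional variance given theta
    for u, (a, b, c) in zip(responses, items):
        p = p_3pl(theta, a, b, c)
        q = 1.0 - p
        log_lik += u * math.log(p) + (1 - u) * math.log(q)
        expected += p * math.log(p) + q * math.log(q)
        variance += p * q * math.log(p / q) ** 2
    return (log_lik - expected) / math.sqrt(variance)

# Hypothetical five-item test, easiest item first.
items = [(1.0, -2.0, 0.0), (1.0, -1.0, 0.0), (1.0, 0.0, 0.0),
         (1.0, 1.0, 0.0), (1.0, 2.0, 0.0)]

consistent = [1, 1, 1, 0, 0]  # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]    # easy items wrong, hard items right

lz_consistent = l_z(consistent, 0.0, items)
lz_aberrant = l_z(aberrant, 0.0, items)
print(lz_consistent, lz_aberrant)
```

Large negative values flag response patterns that are unlikely given the ability value, which is why the aberrant pattern scores far below the consistent one. Centering and scaling by the conditional mean and variance is one way to pursue the standardization criterion described in the abstract; how closely any such index approaches an optimal index is the question the paper examines.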

[1] Fritz Drasgow, et al. Optimal Detection of Certain Forms of Inappropriate Test Scores, 1986.

[2] F. Drasgow, et al. Performance Envelopes and Optimal Appropriateness Measurement, 1984.

[3] Fritz Drasgow, et al. Appropriateness measurement with polychotomous item response models and standardized indices, 1984.

[4] Kikumi K. Tatsuoka, et al. Caution indices based on item response theory, 1984.

[5] Lawrence M. Rudner. Individual Assessment Accuracy, 1983.

[6] Fritz Drasgow, et al. The Relation between Incorrect Option Choice and Estimated Ability, 1983.

[7] Kikumi K. Tatsuoka, et al. Indices for Detecting Unusual Patterns: Links Between Two General Approaches and Potential Applications, 1983.

[8] Michael V. LeVine, et al. Appropriateness measurement: Review, critique and validating studies, 1982.

[9] F. Lord. Applications of Item Response Theory To Practical Testing Problems, 1980.

[10] Norman Cliff, et al. Test theory without true scores? 1979.

[11] B. Efron, et al. Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information, 1978.

[12] Benjamin D. Wright, et al. Solving measurement problems with the Rasch model, 1977.

[13] F. Massey. The Kolmogorov-Smirnov Test for Goodness of Fit, 1951.

[14] Fritz Drasgow, et al. Item response theory: application to psychological measurement, 1983.