ASSESSING UNUSUAL AGREEMENT BETWEEN THE INCORRECT ANSWERS OF TWO EXAMINEES USING THE K‐INDEX: STATISTICAL THEORY AND EMPIRICAL SUPPORT

Test security and related concerns can create an interest in assessing how unusual it is for the answers of two different examinees to agree as much as they do. At Educational Testing Service, a measure called the K-index is used to assess ‘unusual agreement’ between the incorrect answers of two examinees on a multiple-choice test. Here, I describe the K-index and report the results of an empirical study of some of the assumptions that underlie it and its use. The results show that the K-index can be expected to give a conservative estimate of the probability of chance agreement in the typical situations for which it is used, and that several important assumptions underlying the K-index are supported by relevant data. In addition, the results presented here suggest a minor modification to the current (as of 1993) application of the K-index to part of the SAT to better ensure that it is a conservative measure of chance agreement.
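
(For orientation only, since the abstract itself does not define the statistic: an agreement index of this general kind is typically an upper-tail probability computed under a binomial model of chance matching. Writing $w_s$ for the number of items one examinee answered incorrectly, $m$ for the observed number of identical incorrect responses shared with the other examinee, and $\hat{p}$ for an estimated per-item probability of matching by chance, such a probability takes the form
\[
  \Pr(M \ge m) \;=\; \sum_{j=m}^{w_s} \binom{w_s}{j}\,\hat{p}^{\,j}\,(1-\hat{p})^{\,w_s - j},
\]
with small values indicating agreement that would be unusual under chance alone. The notation here is introduced for illustration and is not taken from the paper; the precise definition of the K-index is given in the body of the article.)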