Solving Measurement Problems with an Answer-Until-Correct Scoring Procedure

Answer-until-correct (AUC) tests have been in use for some time. Pressey (1950) pointed to their advantages in enhancing learning, and Brown (1965) proposed a scoring procedure for AUC tests that appears to increase reliability (Gilman & Ferry, 1972; Hanna, 1975). This paper describes a new scoring procedure for AUC tests that (1) makes it possible to determine whether guessing is at random, (2) gives a measure of how "far away" guessing is from being random, (3) corrects observed test scores for partial information, and (4) yields a measure of how well an item reveals whether an examinee knows or does not know the correct response. In addition, the paper derives the optimal linear estimate (under squared-error loss) of true score that is corrected for partial information, as well as another formula score under the assumption that the Dirichlet-multinomial model holds. Once certain parameters are estimated, the latter formula score makes it possible to correct for partial information using only the examinee's usual number-correct observed score. The importance of this formula score is discussed. Finally, various statistical techniques are described that can be used to check the assumptions underlying the proposed scoring procedure.

[1]  Clyde H. Coombs,et al.  The Assessment of Partial Knowledge , 1956 .

[2]  F. Lord A strong true-score theory, with applications. , 1965, Psychometrika.

[3]  An Index for a Domain of Completion or Short Answer Items , 1978 .

[4]  Michael J. Subkoviak,et al.  Empirical Investigation of Procedures for Estimating Reliability for Mastery Tests. , 1978 .

[5]  I. Olkin,et al.  Inequalities: Theory of Majorization and Its Applications , 1980 .

[6]  Cyril Burt,et al.  Fundamentals of Statistics. , 1948 .

[7]  Donald G. Morrison,et al.  A modified beta binomial model with applications to multiple choice and taste tests , 1979 .

[8]  R. Wilcox Determining the Length of a Criterion-Referenced Test , 1980 .

[9]  J. Brown,et al.  MULTIPLE RESPONSE EVALUATION OF DISCRIMINATION. , 1965, The British journal of mathematical and statistical psychology.

[10]  J. Kalbfleisch Statistical Inference Under Order Restrictions , 1975 .

[11]  R. Wilcox Estimating true score in the compound binomial error model , 1978 .

[12]  Rand R. Wilcox,et al.  A Review of the Beta-Binomial Model and its Extensions , 1981 .

[13]  A. M. Carr-Saunders,et al.  Wealth and Welfare , 1913 .

[14]  Ingram Olkin,et al.  A subset selection technique for scoring items on a multiple choice test , 1979 .

[15]  C. Mitchell Dayton,et al.  The Use of Probabilistic Models in the Assessment of Mastery , 1977 .

[16]  J. Mosimann On the compound multinomial distribution, the multivariate β-distribution, and correlations among proportions , 1962 .

[17]  B. Griffin,et al.  Optimal linear estimators: an empirical Bayes version with application to the binomial distribution , 1971 .

[18]  Huynh Huynh,et al.  Statistical consideration of mastery scores , 1976 .

[19]  G. S. Hanna INCREMENTAL RELIABILITY AND VALIDITY OF MULTIPLE-CHOICE TESTS WITH AN ANSWER-UNTIL-CORRECT PROCEDURE , 1975 .

[20]  Rand R. Wilcox Achievement tests and latent structure models , 1979 .

[21]  R. Weitzman Ideal Multiple-choice Items , 1970 .

[22]  Tim Robertson,et al.  Testing for and against an Order Restriction on Multinomial Parameters , 1978 .

[23]  M. O. Lorenz,et al.  Methods of Measuring the Concentration of Wealth , 1905, Publications of the American Statistical Association.

[24]  D. Gilman,et al.  INCREASING TEST RELIABILITY THROUGH SELF-SCORING PROCEDURES , 1972 .

[25]  R. Wilcox The Single Administration Estimate of the Proportion of Agreement of a Proficiency Test Scored with a Latent Structure Model , 1981 .

[26]  P. W. Zehna Invariance of Maximum Likelihood Estimators , 1966 .

[27]  George B. Macready,et al.  A probabilistic model for validation of behavioral hierarchies , 1976 .

[28]  L. Hubert,et al.  Inference Procedures for Ordering Theory , 1977 .

[29]  R. Wilcox Analyzing the Distractors of Multiple-Choice Test Items or Partitioning Multinomial Cell Probabilities with Respect to a Standard , 1981 .

[30]  Jacob Cohen A Coefficient of Agreement for Nominal Scales , 1960 .

[31]  R. Wilcox Estimating the Likelihood of False-Positive and False-Negative Decisions in Mastery Testing: An Empirical Bayes Approach , 1977 .

[32]  Arthur Cecil Pigou,et al.  Wealth and Welfare. , 1913 .

[33]  M. R. Novick,et al.  Statistical Theories of Mental Test Scores. , 1971 .

[34]  Alan R. Hartke THE USE OF LATENT PARTITION ANALYSIS TO IDENTIFY HOMOGENEITY OF AN ITEM POPULATION , 1978 .

[35]  James D. Laing,et al.  Prediction Analysis of Cross Classifications. , 1976 .

[36]  I. Olkin,et al.  Selecting and Ordering Populations: A New Statistical Methodology , 1977 .

[37]  G. Duncan An Empirical Bayes Approach to Scoring Multiple-Choice Tests in the Misinformation Model , 1974 .

[38]  Michael J. Noe,et al.  A STUDY OF THE ACCURACY OF SUBKOVIAK'S SINGLE‐ADMINISTRATION ESTIMATE OF THE COEFFICIENT OF AGREEMENT USING TWO TRUE‐SCORE ESTIMATES , 1978 .

[39]  H. Dalton The Measurement of the Inequality of Incomes , 1920 .

[40]  R. Frary,et al.  AN EMPIRICAL TEST OF LORD'S THEORETICAL RESULTS REGARDING FORMULA SCORING OF MULTIPLE‐CHOICE TESTS , 1977 .

[41]  Joseph M. Scandura,et al.  Deterministic and probabilistic theorizing in structural learning , 1977 .

[42]  Ronald K. Hambleton,et al.  Criterion-Referenced Testing and Measurement: A Review of Technical Issues and Developments , 1978 .

[43]  S. Pressey Development and Appraisal of Devices Providing Immediate Automatic Scoring of Objective Tests and Concomitant Self-Instruction , 1950 .

[44]  Paul Horst,et al.  The difficulty of a multiple choice test item. , 1933 .

[45]  R. Frary The Effect of Misinformation, Partial Information, and Guessing on Expected Multiple-Choice Test Item Scores , 1980 .

[46]  L. Bliss,et al.  A Test of Lord's Assumption Regarding Examinee Guessing Behavior on Multiple-Choice Tests Using Elementary School Students. , 1980 .

[47]  Frederic M. Lord,et al.  A theoretical distribution for mental test scores , 1962 .

[48]  R. F.,et al.  Mathematical Statistics , 1944, Nature.