Students were administered a midterm test under one of three procedures. Results from the students who took the test under conventional directions provided a baseline against which the results from students tested under differential weighting of response alternatives or under confidence testing instructions were compared in terms of reliability and validity. Reliability was estimated by the split-half technique. Validity was estimated by correlating midterm test scores with scores on a final examination. This investigation provides some support for the contention that validity can be improved by using more sophisticated testing techniques. Suggestions for the conduct of more definitive studies were offered.

Tests presented in multiple-choice format pose a continuing problem for testers. The problem arises in connection with incorrect or omitted responses and consists of obtaining more information about an examinee's knowledge than the bare fact that he could not produce the correct answer. Studies by Dressel and Schmid (1953) and Coombs, Milholland and Womer (1956) suggest that partial knowledge, that is, knowledge inadequate to produce a correct answer, is a measurable characteristic. Two kinds of procedures are typically employed: those that differentially weight the response alternatives and those that require examinees to report their confidence in the correctness of the response alternatives.

Differential Weighting of Response Alternatives

Differential weighting procedures deviate from the conventional way of scoring multiple-choice tests. Instead of scoring 1 for correct answers and 0 for incorrect and omitted responses, differential scoring weights are assigned to each response alternative of an item. A test score then consists of the sum of the weights of the response alternatives, both correct and incorrect, that the examinee chose in answering the test.
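To make this scoring rule concrete, the following minimal sketch in Python computes a differentially weighted test score. The items, weights, and responses are illustrative assumptions for exposition only; the study itself does not specify them.

    # Minimal sketch of differentially weighted scoring; all weights and
    # responses below are hypothetical, chosen only to illustrate the rule.
    # Conventional scoring is the special case in which the keyed answer
    # is weighted 1 and every other alternative (and an omission) 0.

    # Assumed per-item weights: full credit for the keyed alternative,
    # partial credit for distractors reflecting partial knowledge.
    ITEM_WEIGHTS = [
        {"A": 1.0, "B": 0.5, "C": 0.0, "D": -0.25},
        {"A": 0.0, "B": 1.0, "C": 0.25, "D": 0.0},
    ]

    def weighted_score(responses, item_weights, omit_weight=0.0):
        """Sum the weights of the chosen alternatives; None marks an omission."""
        total = 0.0
        for choice, weights in zip(responses, item_weights):
            if choice is None:
                total += omit_weight
            else:
                total += weights.get(choice, omit_weight)
        return total

    # Choosing the distractor "B" on item 1 and the keyed "B" on item 2
    # yields 0.5 + 1.0 = 1.5, versus 1 under conventional 1/0 scoring.
    print(weighted_score(["B", "B"], ITEM_WEIGHTS))  # 1.5

Negative weights, as sketched above for a gross error, echo Nedelsky's (1954) suggestion that the ability to avoid gross error is itself measurable.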
[1] E. H. Shuford, et al. Admissible probability measurement procedures. Psychometrika, 1966.
[2] L. Nedelsky. Ability to Avoid Gross Error as a Measure of Achievement, 1954.
[3] M. R. Novick, et al. Statistical Theories of Mental Test Scores, 1971.
[4] J. Raven. Guide to Using the Coloured Progressive Matrices, 1958.
[5] C. Dean Miller, et al. Scoring, Analyzing, and Reporting Classroom Tests Using an Optical Reader and 1401 Computer, 1967.
[6] B. de Finetti, et al. Methods for Discriminating Levels of Partial Knowledge Concerning a Test Item. The British Journal of Mathematical and Statistical Psychology, 1965.
[7] John Schmid, et al. Some Modifications of the Multiple-Choice Item, 1953.
[8] I. M. Schlesinger, et al. Systematic Construction of Distractors for Ability and Achievement Test Items, 1967.
[9] Joan J. Michael. The Reliability of a Multiple-Choice Examination under Various Test-Taking Instructions, 1968.
[10] Information in Wrong Responses, 1968.
[11] Marilyn D. Wang, et al. Differential Weighting: A Survey of Methods and Empirical Studies, 1968.
[12] Paul I. Jacobs, et al. Information in Wrong Responses, 1970.
[13] Clyde H. Coombs, et al. The Assessment of Partial Knowledge, 1956.