Reliability of a computerized scoring routine for an open-ended task

Abstract: This article argues for the value of open-ended responses in CALL lessons and language tests. It reports a study in which students' notes and recall protocols for computerized reading passages were scored both by human raters and by a computer program. The reliability of the human scores was calculated using coefficient alpha; the agreement between human and computer scores was computed using Pearson's product-moment correlation coefficient. Results indicated that the computer program's scores were consistent with those of the human raters and were produced in much less time.
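The two statistics named in the abstract can be illustrated with a minimal sketch. This is not the study's scoring program or its data: the function names `cronbach_alpha` and `pearson_r` and the score values are hypothetical, chosen only to show how each coefficient is computed from a table of scores.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Coefficient alpha for a score table.

    ratings: list of rows, one row per student, one score per rater.
    """
    k = len(ratings[0])                       # number of raters
    rater_columns = list(zip(*ratings))       # scores grouped by rater
    sum_rater_var = sum(variance(col) for col in rater_columns)
    total_var = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum_rater_var / total_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical scores: three human raters scoring five recall protocols.
human = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
# Hypothetical scores assigned by a program to the same five protocols.
computer = [4, 2, 5, 1, 3]

human_mean = [mean(row) for row in human]
alpha = cronbach_alpha(human)
r = pearson_r(human_mean, computer)
print(f"alpha among human raters: {alpha:.3f}")
print(f"r between mean human score and computer score: {r:.3f}")
```

Averaging the human scores before correlating them with the computer scores is one possible design choice; the abstract does not say how the human scores were combined.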
