Divide and correct: using clusters to grade short answers at scale

Compared with multiple-choice and other recognition-oriented forms of assessment, short-answer questions have been shown to offer greater value for both students and teachers: for students they improve retention of knowledge, while for teachers they provide more insight into student understanding. Unfortunately, the same open-ended nature that makes them so valuable also makes them difficult to grade at scale. To address this, we propose a cluster-based interface that allows teachers to read, grade, and provide feedback on large groups of answers at once. We evaluated this interface against an unclustered baseline in a within-subjects study with 25 teachers, and found that the clustered interface allows teachers to grade substantially faster, give more feedback to students, and develop a high-level view of students' understanding and misconceptions.
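The abstract does not specify the clustering method, but the core idea of grouping similar answers so one grading action covers many students can be sketched with a simple, hypothetical approach: normalize each answer to a bag of words and greedily merge answers whose word sets overlap above a Jaccard-similarity threshold. This is an illustrative stand-in, not the paper's actual algorithm; the `normalize`, `cluster_answers`, and `threshold` names are assumptions.

```python
import re

def normalize(answer: str) -> frozenset:
    """Lowercase, strip punctuation, and treat the answer as a set of words."""
    return frozenset(re.findall(r"[a-z0-9]+", answer.lower()))

def cluster_answers(answers, threshold=0.5):
    """Greedily group answers whose word sets have Jaccard similarity
    >= threshold, so one grade or comment can apply to the whole group."""
    clusters = []  # list of (representative word set, member indices)
    for i, ans in enumerate(answers):
        words = normalize(ans)
        for rep, members in clusters:
            union = rep | words
            if union and len(rep & words) / len(union) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((words, [i]))
    return [members for _, members in clusters]

answers = [
    "The mitochondria is the powerhouse of the cell.",
    "Mitochondria: powerhouse of the cell",
    "It makes proteins for the cell.",
]
print(cluster_answers(answers))  # first two answers land in one cluster
```

In a real system the grouping would likely use richer semantic-similarity features (see the short-answer grading literature cited below), but even this coarse grouping conveys how a teacher could grade and comment on a whole cluster at once.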
