Peer and self assessment in massive online classes

Peer and self-assessment offer an opportunity to scale both assessment and learning to global classrooms. This article reports our experiences with two iterations of the first large online class to use peer and self-assessment. In this class, peer grades correlated highly with staff-assigned grades. The second iteration had 42.9% of students’ grades within 5% of the staff grade, and 65.5% within 10%. On average, students assessed their work 7% higher than staff did. Students also rated work by peers from their own country 3.6% higher than work from elsewhere. We performed three experiments to improve grading accuracy. We found that giving students feedback about their grading bias increased subsequent accuracy. We introduce short, customizable feedback snippets that cover common issues with assignments, providing students more qualitative peer feedback. Finally, we introduce a data-driven approach that highlights high-variance items for improvement. We find that rubrics that use a parallel sentence structure, unambiguous wording, and well-specified dimensions have lower variance. After revising rubrics, median grading error decreased from 12.4% to 9.9%.
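
The agreement figures above (share of grades within 5% or 10% of the staff grade, and median grading error) can be illustrated with a minimal sketch. It assumes that error is the absolute difference between a peer grade and the staff grade, expressed as a percentage of the maximum score; the abstract does not spell out the exact definition. The function name `agreement_summary`, the `max_score` parameter, and the sample grades are hypothetical and are not data from the class.

```python
from statistics import median


def agreement_summary(peer, staff, max_score=100.0):
    """Summarize how closely peer grades track staff grades.

    Assumption: error = |peer - staff| as a percentage of max_score.
    """
    if len(peer) != len(staff) or not peer:
        raise ValueError("peer and staff must be non-empty and of equal length")

    # Per-submission grading error, in percent of the maximum score.
    errors = [100.0 * abs(p - s) / max_score for p, s in zip(peer, staff)]
    n = len(errors)

    return {
        # Percentage of submissions whose error is at most 5 / 10 points.
        "within_5": 100.0 * sum(e <= 5.0 for e in errors) / n,
        "within_10": 100.0 * sum(e <= 10.0 for e in errors) / n,
        "median_error": median(errors),
    }


# Made-up illustrative grades, not results from the article.
peer_grades = [72.0, 85.0, 90.0, 60.0, 78.0]
staff_grades = [70.0, 80.0, 78.0, 66.0, 77.0]

summary = agreement_summary(peer_grades, staff_grades)
print(summary)
```

Comparing `median_error` before and after a rubric revision is one way to track the kind of improvement the abstract reports.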
