Automated assessment of programming assignments: visual feedback, assignment mobility, and assessment of students' testing skills

Aalto University, P.O. Box 11000, FI-00076 Aalto, www.aalto.fi

Author: Petri Ihantola
Name of the doctoral dissertation: Automated Assessment of Programming Assignments: Visual Feedback, Assignment Mobility, and Assessment of Students' Testing Skills
Publisher: Aalto University School of Science
Unit: Department of Computer Science and Engineering
Series: Aalto University publication series DOCTORAL DISSERTATIONS 131/2011
Field of research: Software Systems
Manuscript submitted: 12 July 2011
Manuscript revised: 24 October 2011
Date of the defence: 9 December 2011
Language: English
Type of dissertation: Article dissertation (summary + original articles)

Abstract

The main objective of this thesis is to improve the automated assessment of programming assignments from the perspective of assessment tool developers. We have developed visual feedback on the functionality of students' programs and explored methods to control the level of detail in such feedback. We have found that visual feedback does not require major changes to existing assessment platforms. Most modern platforms are web-based, which creates an opportunity to describe visualizations in JavaScript and HTML embedded into the textual feedback. Our preliminary results on the effectiveness of automatic visual feedback indicate that students perform equally well with visual and textual feedback. However, visual feedback based on automatically extracted object graphs can take less time to prepare than textual feedback of good quality.

We have also developed programming assignments that are easier to port from one server environment to another by performing the assessment on the client side. This not only makes it easier to use the same assignments in different server environments but also removes the need for sandboxing the execution of students' programs. The approach will likely become more important in the future as interactive study materials become more popular. Client-side assessment is better suited to self-study material than to grading, because assessment results sent by a client are often too easy to falsify.

Testing is an important part of programming, and automated assessment should also cover students' self-written tests. We have analyzed how students behave when they are rewarded for structural test coverage (e.g., line coverage) and found that this can lead them to write tests with good coverage but poor ability to detect faulty programs. Mutation analysis, in which a large number of (faulty) programs are automatically derived from the program under test, turns out to be an effective way to detect tests that would otherwise fool our assessment systems. Applying mutation analysis directly to grading is problematic, however, because some of the derived programs are equivalent to the original, and some assignments or solution strategies generate more equivalent mutants than others.
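As a concrete illustration of the first paragraph above, the sketch below shows how feedback on an automatically extracted object graph could be emitted as an HTML fragment built in JavaScript and embedded into otherwise textual feedback. The graph format, function name, and level-of-detail option are hypothetical, not taken from the thesis or from any particular assessment platform.

```javascript
// Minimal sketch (not from the thesis): turn an automatically extracted object
// graph into an HTML fragment that can be embedded into textual feedback.
// The graph format, function name, and "detail" option are all hypothetical.

function objectGraphToHtml(graph, detail = "full") {
  // At lower detail, omit field values to keep the visualization compact.
  const describe = node =>
    detail === "full"
      ? `<strong>${node.id}</strong>: ${node.type} ${JSON.stringify(node.fields)}`
      : `<strong>${node.id}</strong>: ${node.type}`;

  const nodes = graph.nodes.map(n => `<li>${describe(n)}</li>`).join("");
  const edges = graph.edges
    .map(e => `<li>${e.from} &rarr; ${e.to} (${e.label})</li>`)
    .join("");

  return `<div class="feedback-visualization">
  <p>Objects in your program after the tested operation:</p>
  <ul>${nodes}</ul>
  <ul>${edges}</ul>
</div>`;
}

// Example: a two-node linked list extracted from a student's solution.
const html = objectGraphToHtml({
  nodes: [
    { id: "list", type: "LinkedList", fields: { size: 2 } },
    { id: "n1",   type: "Node",       fields: { value: 7 } },
    { id: "n2",   type: "Node",       fields: { value: 42 } }
  ],
  edges: [
    { from: "list", to: "n1", label: "head" },
    { from: "n1",   to: "n2", label: "next" }
  ]
}, "full");

console.log(html); // the assessment platform can include this string in its feedback page
```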
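The trade-off described in the second paragraph can be made concrete with a minimal sketch of client-side assessment; the tests, the sample solution, and the submission endpoint are all invented for illustration. Because everything runs in the student's browser, the same assignment works against any server and no sandbox is needed, but the reported score is trivially forgeable.

```javascript
// Minimal sketch of client-side assessment (tests, solution, and endpoint are
// invented). Everything runs in the student's browser, so no server-side
// execution environment or sandbox is needed -- but for the same reason,
// the reported result is entirely under the student's control.

function gradeClientSide(studentSolution) {
  const tests = [
    { input: [2, 3],  expected: 5 },
    { input: [-1, 1], expected: 0 }
  ];
  const passed = tests.filter(t => studentSolution(...t.input) === t.expected).length;
  return { passed, total: tests.length };
}

// Immediate feedback with no server round trip: well suited to self-study material.
console.log(gradeClientSide((a, b) => a + b)); // { passed: 2, total: 2 }

// Unsuitable for grading: nothing prevents a student from skipping the tests and
// reporting a perfect score directly, e.g.
//   fetch("https://grader.example/submit", { method: "POST",
//          body: JSON.stringify({ passed: 2, total: 2 }) });
```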

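The last paragraph can likewise be illustrated with a small, self-contained sketch, assuming an invented function under test and hand-written mutants in place of the automatically derived ones produced by mutation testing tools. A test suite rewarded only for line coverage can pass while killing no mutants, and the first mutant below is equivalent to the original program, which is exactly why raw mutation scores are difficult to use directly for grading.

```javascript
// Minimal sketch of why line coverage alone is a weak reward signal and how
// mutation analysis catches weak tests. Function, tests, and mutants are invented.

function max(a, b) {            // program under test
  return a > b ? a : b;
}

// A "coverage-gaming" test: it executes every line of max() but asserts nothing
// useful, so it reaches full line coverage while detecting almost no faults.
function weakTest(maxImpl) {
  maxImpl(1, 2);
  maxImpl(2, 1);
  return true;                  // always passes
}

// A meaningful test suite with real assertions.
function strongTest(maxImpl) {
  return maxImpl(1, 2) === 2 && maxImpl(2, 1) === 2 && maxImpl(3, 3) === 3;
}

// Hand-written mutants standing in for the automatically derived ones.
const mutants = [
  (a, b) => (a >= b ? a : b),   // equivalent mutant: behaves like the original
  (a, b) => (a < b ? a : b),    // faulty: returns the minimum
  (a, b) => a                   // faulty: ignores the second argument
];

function mutationScore(test) {
  const killed = mutants.filter(m => !test(m)).length;
  return `${killed}/${mutants.length} mutants killed`;
}

console.log("weak test:  ", mutationScore(weakTest));   // 0/3 mutants killed
console.log("strong test:", mutationScore(strongTest)); // 2/3 (the equivalent mutant survives)
```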