IMPROVING THE RELIABILITY AND VALIDITY OF TEST DATA ADEQUACY IN PROGRAMMING ASSESSMENTS

Automatic Programming Assessment (APA) has recently become a notable method for assisting educators of programming courses in automatically assessing and grading students' programming exercises; its manual counterpart is prone to errors and leads to inconsistency. In practice, APA also provides an effective means of reducing educators' workload. By default, the test data generation process plays an important role in performing dynamic testing on students' programs. Dynamic testing involves executing a program against different inputs or test data and comparing the results with the expected output, which must conform to the program specifications. In the software testing field, there are diverse automated methods for test data generation. Unfortunately, APA rarely adopts these methods, and only limited studies have attempted to integrate APA with test data generation to include more useful features and to provide precise and thorough program testing. Thus, we propose a test data generation framework known as FaSt-Gen that covers both the functional and structural testing of a program for APA. Functional testing relies on specified functional requirements and focuses on the output generated in response to the selected test data during execution, whereas structural testing examines the specific program logic to verify how it works. Overall, FaSt-Gen provides educators of programming courses with a means to furnish an adequate set of test data for assessing students' programming solutions, without requiring particular expertise in test case design. FaSt-Gen integrates positive and negative testing criteria, or so-called reliable and valid test adequacy criteria, to derive the desired test data and test set schema. For functional testing, the integration of specification-derived test and simplified boundary value analysis techniques covers both criteria, while the path coverage criterion guides test data selection for structural testing. The findings from the controlled experiment and the comparative study evaluation show that FaSt-Gen improves the reliability and validity of test data adequacy in programming assessments.
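To make the functional-testing idea concrete, the sketch below is a minimal illustration (in Python, not the authors' implementation) of how simplified boundary value analysis could derive both positive (reliable) and negative (validity-checking) test data from a specified integer input range; the range values, function name, and output schema are assumptions introduced purely for illustration.

```python
# Minimal sketch: deriving positive and negative test data from an input range
# stated in a program specification, in the spirit of simplified boundary
# value analysis. The function name and the 0..100 example range below are
# hypothetical assumptions, not part of FaSt-Gen itself.

def boundary_test_data(lower: int, upper: int) -> dict:
    """Return a test set schema built around the boundaries of [lower, upper]."""
    return {
        # Positive (valid) test data: values the program must accept.
        "positive": [lower, lower + 1, (lower + upper) // 2, upper - 1, upper],
        # Negative (invalid) test data: values just outside the specified range,
        # which the program should reject or handle gracefully.
        "negative": [lower - 1, upper + 1],
    }

if __name__ == "__main__":
    # Example: a specification stating that an exam mark must lie in 0..100.
    schema = boundary_test_data(0, 100)
    print("positive test data:", schema["positive"])
    print("negative test data:", schema["negative"])
```

In such a scheme, the positive values exercise the behaviour required by the specification, while the negative values probe how the student's program handles out-of-range inputs; together they correspond to the reliable and valid adequacy criteria described above.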
