Evaluating plagiarism detection software for introductory programming assignments

Plagiarism is an issue that all educators have had to deal with. Large numbers of students and assignments have led to the development of automated systems that detect code similarities in order to identify submissions that may have been plagiarised. These systems are of great value to assessors because they allow submissions to be processed automatically. However, automated systems also have potential drawbacks. In this study we explore and analyse the differences between various systems and compare their performance with manual checking. We first consider the different methods students use when committing plagiarism. We then examine more closely the systems that can aid plagiarism detection, covering both their characteristics and how they work. In the process, we determine how these systems compare with our own system and how suitable they are for helping to identify submissions that may have been plagiarised in our introductory C++ course.
