Source code plagiarism detection: The Unix way

This paper describes a language-independent method for detecting source code similarity. It is based on the idea of maximizing the reuse of standard Unix filters. The method was implemented and benchmarked on real-world datasets (students' assignments) and on synthetic datasets (a perfect plagiarism experiment). It achieved significantly better results than competing tools that are considered the gold standard in plagiarism detection.
