A source code linearization technique for detecting plagiarized programs

Detecting plagiarized programs is an important problem in computer science education, and many tools and algorithms have been developed for this purpose. These tools generally operate in two phases. In phase 1, the plagiarism detection tool generates an intermediate representation from a given set of programs. The intermediate representation should reflect the structural characteristics of each program; most tools use a parse tree or a token sequence as the intermediate representation. In phase 2, the tool searches for plagiarized material and evaluates the similarity of each pair of programs. It is also helpful to report the plagiarized material shared by two programs to the instructor. In this paper, we present the static tracing method, which aims to improve the accuracy of program plagiarism detection. The static tracing method statically executes a program at the syntax level and extracts predefined keywords in the order in which functions are executed. Experimental results show that this method detects plagiarism more effectively than the previously released plagiarism detection method.
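
The idea of linearizing a program by tracing it statically can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes Python as the analyzed language, a hypothetical keyword table `KEYWORDS`, a hypothetical entry point named `main`, and `difflib.SequenceMatcher` as a stand-in similarity measure for phase 2. It follows calls to user-defined functions in execution order without running the program, so reordered or unused function definitions do not change the resulting keyword sequence.

```python
import ast
from difflib import SequenceMatcher

# Hypothetical keyword table (an assumption, not the paper's keyword set):
# maps syntax-tree node types to the tokens emitted during tracing.
KEYWORDS = {
    ast.If: "IF",
    ast.For: "FOR",
    ast.While: "WHILE",
    ast.Return: "RETURN",
    ast.Assign: "ASSIGN",
}


def linearize(source, entry="main"):
    """Return the keyword sequence obtained by tracing `source` from `entry`.

    The program is never run. Every branch of a conditional is visited, and
    a call to a user-defined function is replaced by that function's trace.
    """
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    tokens = []
    active = set()  # functions currently being traced (guards recursion)

    def trace(node):
        if isinstance(node, ast.Call):
            # Visit the callee expression and arguments before the call itself.
            for child in ast.iter_child_nodes(node):
                trace(child)
            name = node.func.id if isinstance(node.func, ast.Name) else None
            if name in functions and name not in active:
                active.add(name)
                for stmt in functions[name].body:
                    trace(stmt)
                active.discard(name)
            else:
                tokens.append("CALL")  # library, unknown, or recursive call
            return
        keyword = KEYWORDS.get(type(node))
        if keyword is not None:
            tokens.append(keyword)
        for child in ast.iter_child_nodes(node):
            trace(child)

    if entry in functions:
        active.add(entry)
        for stmt in functions[entry].body:
            trace(stmt)
    return tokens


def similarity(tokens_a, tokens_b):
    """Stand-in similarity score in [0, 1] between two keyword sequences."""
    return SequenceMatcher(None, tokens_a, tokens_b).ratio()


ORIGINAL = '''
def total(xs):
    s = 0
    for x in xs:
        s = s + x
    return s

def main():
    data = [1, 2, 3]
    print(total(data))
'''

# Same logic with renamed identifiers, reordered definitions, and dead code.
DISGUISED = '''
def main():
    values = [1, 2, 3]
    print(compute(values))

def helper_unused():
    while True:
        pass

def compute(items):
    acc = 0
    for item in items:
        acc = acc + item
    return acc
'''

if __name__ == "__main__":
    a = linearize(ORIGINAL)
    b = linearize(DISGUISED)
    print(a)
    print(similarity(a, b))
```

In this sketch both programs linearize to the same sequence, because the unused function is never reached from `main` and the renamed identifiers are not part of the keyword table. A real tool would need a richer keyword set and a similarity measure suited to reporting the matching regions to the instructor.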