A Method for Detecting Program Plagiarism Comparing Class Structure Graphs

Many results on program comparison have been reported in recent years, as code theft has become more frequent with increasing code mobility. This paper proposes a plagiarism detection method based on class structure. For each class, the proposed method constructs a graph that represents the referential relationship between member variables and methods. This relationship is modeled as a bipartite graph, and a graph isomorphism test is applied to the resulting sets of graphs to measure the similarity of two programs. To evaluate the effectiveness of the method, an experiment was conducted on a test set consisting of Java source code submitted as solutions to programming assignments in the Object-Oriented Programming course at Pusan National University in 2012. Accuracy was evaluated by comparing the F-measure of the proposed method with those of JPlag and Stigmata. In the experiment, the F-measure of the proposed method was higher than those of JPlag and Stigmata by 0.17 and 0.34, respectively.
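The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration only: each class is given as a mapping from method name to the set of member variables it references, isomorphism is tested by brute force over permutations (workable only for small classes), methods may only map to methods and variables to variables, and program similarity is taken to be the fraction of classes with an isomorphic counterpart. All function names (`build_graph`, `isomorphic`, `similarity`, `f_measure`) and the sample classes are hypothetical.

```python
from itertools import permutations


def build_graph(accesses):
    """Turn {method: {member variables it references}} into a bipartite graph.

    Names are replaced by indices so that renaming identifiers has no effect.
    Returns (number of methods, number of variables, set of edges).
    """
    methods = sorted(accesses)
    fields = sorted({f for used in accesses.values() for f in used})
    edges = frozenset(
        (methods.index(m), fields.index(f))
        for m, used in accesses.items()
        for f in used
    )
    return len(methods), len(fields), edges


def isomorphic(g1, g2):
    """Brute-force bipartite isomorphism test (assumption: small graphs only)."""
    m1, f1, e1 = g1
    m2, f2, e2 = g2
    if (m1, f1, len(e1)) != (m2, f2, len(e2)):
        return False
    # Try every relabeling of methods and of variables, keeping the two sides apart.
    for pm in permutations(range(m1)):
        for pf in permutations(range(f1)):
            if {(pm[m], pf[f]) for m, f in e1} == e2:
                return True
    return False


def similarity(prog_a, prog_b):
    """Fraction of classes matched one-to-one by isomorphism (assumed measure)."""
    if not prog_a or not prog_b:
        return 0.0
    graphs_b = [build_graph(c) for c in prog_b.values()]
    matched = 0
    for cls in prog_a.values():
        g = build_graph(cls)
        for i, other in enumerate(graphs_b):
            if isomorphic(g, other):
                matched += 1
                del graphs_b[i]  # each class in prog_b is matched at most once
                break
    return matched / max(len(prog_a), len(prog_b))


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from pair-level detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical programs: `renamed` is `original` with every identifier changed.
original = {"Account": {"deposit": {"balance"},
                        "withdraw": {"balance", "limit"},
                        "getLimit": {"limit"}}}
renamed = {"Acct": {"put": {"amt"},
                    "take": {"amt", "cap"},
                    "readCap": {"cap"}}}
different = {"Stack": {"push": {"items", "size"},
                       "pop": {"items", "size"},
                       "peek": {"items"}}}

if __name__ == "__main__":
    print(similarity(original, renamed))    # identical structure despite renaming
    print(similarity(original, different))  # structure differs
```

Because only the structure of references is compared, renaming classes, methods, or variables leaves the graph unchanged, which is why the renamed program still matches. A practical implementation would replace the brute-force search with a dedicated graph matching algorithm.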
