A Similarity Detection Platform for Programming Learning

Code similarity detection has been studied for several decades, and existing approaches are commonly categorized as attribute-counting or structure-metric. Because attribute-counting is effective only for detecting verbatim copies, mature systems usually use the Greedy String Tiling (GST) string matching algorithm to compare code structure. However, the accuracy of GST is vulnerable to interference in code similarity detection. This paper presents a code similarity detection method that combines string matching with sub-graph isomorphism. The similarity is first calculated with the GST algorithm. Based on that similarity, the system then determines whether further processing with the sub-graph isomorphism algorithm is required. Extensive experimental results show that our method significantly improves the efficiency of string matching as well as the accuracy of code similarity detection.
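The two-stage flow described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the token representation, the similarity formula 2 * covered / (len(a) + len(b)), the `min_match` parameter, the `low` and `high` thresholds, and the pluggable `structural_check` callback standing in for the sub-graph isomorphism stage are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = Sequence[str]


def gst_covered(a: Tokens, b: Tokens, min_match: int = 3) -> int:
    """Basic Greedy String Tiling: count tokens covered by non-overlapping tiles."""
    if min_match < 1:
        raise ValueError("min_match must be at least 1")
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    covered = 0
    while True:
        longest = min_match
        matches: List[Tuple[int, int, int]] = []
        # Find all maximal matches of unmarked tokens with the greatest length.
        for i in range(len(a)):
            if marked_a[i]:
                continue
            for j in range(len(b)):
                if marked_b[j]:
                    continue
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > longest:
                    matches, longest = [(i, j, k)], k
                elif k == longest:
                    matches.append((i, j, k))
        if not matches:
            return covered
        # Turn each match into a tile unless an earlier tile already occludes it.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for t in range(k):
                marked_a[i + t] = True
                marked_b[j + t] = True
            covered += k


def gst_similarity(a: Tokens, b: Tokens, min_match: int = 3) -> float:
    """Assumed similarity measure: share of all tokens that lie inside tiles."""
    total = len(a) + len(b)
    return 0.0 if total == 0 else 2 * gst_covered(a, b, min_match) / total


def detect(a: Tokens, b: Tokens,
           structural_check: Callable[[Tokens, Tokens], bool],
           low: float = 0.4, high: float = 0.9) -> Tuple[float, bool]:
    """Stage 1 is GST; stage 2 runs only for borderline scores.

    The thresholds and the decision rule are hypothetical. `structural_check`
    is a placeholder for a sub-graph isomorphism test supplied by the caller.
    """
    sim = gst_similarity(a, b)
    if sim >= high:
        return sim, True       # clearly similar, second stage skipped
    if sim < low:
        return sim, False      # clearly dissimilar, second stage skipped
    return sim, structural_check(a, b)
```

In this sketch the more expensive structural comparison runs only when the GST score is inconclusive, which is one plausible reading of how the first stage decides whether the second is required.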