Comparison of Dice Similarity and Jaccard Coefficient with the Winnowing Algorithm for Similarity Detection of Indonesian Text Documents

Plagiarism is the act of imitating, quoting, or copying other people's work and presenting it as one's own. It is growing rapidly, especially in education, so plagiarism detection is needed to curb it. This paper compares Dice similarity and the Jaccard coefficient to determine which gives the better document similarity value when paired with the Winnowing algorithm, which is used to compute the fingerprint of each document. The test results show that the Winnowing algorithm performs better with Dice similarity, which gave an average similarity value of 71.17615%, than with the Jaccard coefficient, which gave 35.58837%.
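The pipeline described above (fingerprint each document with Winnowing, then score the two fingerprint sets) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the k-gram length `k`, the window size `w`, the CRC32 hash, and the text normalization are all assumptions chosen for the example, since the abstract does not state them.

```python
import re
import zlib


def kgram_hashes(text, k=5):
    """Hash every k-gram of the normalized text (k is an assumed value)."""
    # Assumed normalization: lowercase, keep only letters and digits.
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())
    # CRC32 is an assumed stand-in for the rolling hash; it is deterministic.
    return [zlib.crc32(cleaned[i:i + k].encode("utf-8"))
            for i in range(len(cleaned) - k + 1)]


def winnow(hashes, w=4):
    """Keep the minimum hash of each window of w consecutive hashes."""
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def dice(a, b):
    """Dice similarity: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # assumed convention for two empty fingerprints
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0  # assumed convention for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity(doc1, doc2, k=5, w=4):
    """Return (dice, jaccard) for the Winnowing fingerprints of two texts."""
    f1 = winnow(kgram_hashes(doc1, k), w)
    f2 = winnow(kgram_hashes(doc2, k), w)
    return dice(f1, f2), jaccard(f1, f2)


# Worked example on plain sets: intersection has 2 items, union has 4.
print(dice({1, 2, 3}, {2, 3, 4}))     # 2*2 / (3+3) = 0.666...
print(jaccard({1, 2, 3}, {2, 3, 4}))  # 2 / 4 = 0.5
```

On the same pair of fingerprint sets, the Dice value is never lower than the Jaccard value, so the two measures are best compared as rankings over document pairs and not as raw percentages.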
