Existing plagiarism detection techniques: A systematic mapping of the scholarly literature

Purpose – This paper analyses state-of-the-art plagiarism detection techniques in terms of their limitations, features, taxonomies and processes. Design/methodology/approach – The study was carried out through a comprehensive search for relevant literature in six online database repositories, namely IEEE Xplore, ACM Digital Library, ScienceDirect, EI Compendex, Web of Science and Springer, using search strings derived from the subject under discussion. Findings – The findings revealed that existing plagiarism detection techniques require further enhancement, as they are incapable of efficiently detecting plagiarised ideas, figures, tables, formulas and scanned documents. Originality/value – The contribution of this study lies in exposing current trends in plagiarism detection research and identifying areas where further improvement is required to complement the performance of existing techniques.
