A Plagiarism Detection Engine for Images in Docode

Plagiarism is presenting someone else's work as one's own. Although tools exist for checking the originality of text, they do not work with images, which can be plagiarized as well. In previous work we developed the Docode plagiarism detection system, which operates on text; as commercial requirements have evolved, it now needs to handle images too. In this work we present a plagiarism detection engine for images that fuses texture and color features in a weighted combination, so that it can serve as a general-purpose engine. We ran experiments to measure the improvement obtained by fusing color and texture features, and the impact of downsizing images on the system's performance. The system reaches a recall at 10 of 80%, and images can be downsized by one half without considerable impact on performance. We conclude that it is feasible to build a plagiarism detection engine for images that handles general collections and can be integrated into the Docode system.
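The weighted fusion and the recall-at-10 measure mentioned above can be illustrated with a minimal sketch. It is not the Docode implementation: the abstract does not specify the feature descriptors, the distance function, or the weights, so the Euclidean distance, the `color_weight` parameter, and the dictionary layout with `"color"` and `"texture"` keys are all assumptions made for illustration.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(query, candidate, color_weight=0.5):
    """Weighted combination of a color distance and a texture distance.

    `query` and `candidate` are dicts with hypothetical "color" and
    "texture" feature vectors; `color_weight` in [0, 1] is an assumed
    tuning parameter, and the texture weight is its complement.
    """
    d_color = euclidean(query["color"], candidate["color"])
    d_texture = euclidean(query["texture"], candidate["texture"])
    return color_weight * d_color + (1.0 - color_weight) * d_texture


def rank(query, collection, color_weight=0.5):
    """Return image names from `collection`, closest to `query` first."""
    return sorted(
        collection,
        key=lambda name: fused_distance(query, collection[name], color_weight),
    )


def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant images that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for name in ranked[:k] if name in relevant)
    return hits / len(relevant)
```

In this sketch, a suspicious image would be described by the same two feature vectors as the indexed images, ranked against the collection with `rank`, and the top results evaluated with `recall_at_k` against the known source images.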
