A Review on Near-Duplicate Detection of Images using Computer Vision Techniques

Digital content is now widespread and easily redistributed, whether lawfully or unlawfully. For example, once images are posted on the internet, other web users can modify them and repost their own versions, thereby generating near-duplicate images. The presence of near-duplicates critically affects the performance of search engines. Computer vision is concerned with the automatic extraction, analysis, and understanding of useful information from digital images, and its main application is image understanding. Image understanding involves several tasks, such as feature extraction, object detection, object recognition, image cleaning, and image transformation. No comprehensive survey of near-duplicate image detection exists in the literature. In this paper, we review the state-of-the-art computer vision-based approaches and feature extraction methods for detecting near-duplicate images. We also discuss the main challenges in this field and how other researchers have addressed them. This review offers research directions to researchers interested in working in this field.
