Near-duplicate video clustering using multiple complementary video signatures

This work proposes a near-duplicate video clustering algorithm based on multiple complementary video signatures. We use three kinds of frame descriptors: RGB histogram, color name histogram, and ternary pattern. For each kind, we convert the frame descriptors of a video into a video signature using the bag-of-visual-words scheme, so that each video is represented by three signatures. These signatures are complementary to one another because they are robust to different types of near-duplication. We also develop a clustering technique that refines the pairwise matching results and groups near-duplicate videos. Experimental results on an extensive video dataset show that the proposed algorithm detects near-duplicate videos more effectively than conventional algorithms.
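
As a rough illustration of the bag-of-visual-words step, the sketch below turns the frame descriptors of one video into a single signature and then groups videos whose signatures match. It is a minimal sketch under stated assumptions, not the algorithm of this work: the codebook is assumed to be given, histogram intersection is an assumed similarity measure, and the grouping by connected components is a simple stand-in for the clustering technique, whose details the abstract does not give. All function names and the threshold value are hypothetical.

```python
from collections import Counter
from math import dist


def bovw_signature(frame_descriptors, codebook):
    """Assign each frame descriptor to its nearest codeword and return
    the L1-normalized histogram of codeword counts for the video."""
    counts = Counter(
        min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        for d in frame_descriptors
    )
    n = len(frame_descriptors)
    return [counts[k] / n for k in range(len(codebook))]


def histogram_intersection(sig_a, sig_b):
    """Similarity in [0, 1] between two L1-normalized signatures
    (an assumed measure, chosen here only for illustration)."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))


def group_videos(signatures, threshold):
    """Group videos by connected components over pairwise matches,
    using union-find. This is a stand-in for the paper's clustering."""
    parent = list(range(len(signatures)))

    def find(i):
        # Follow parent links to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(signatures)):
        for j in range(i + 1, len(signatures)):
            if histogram_intersection(signatures[i], signatures[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(signatures)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: 2-D "frame descriptors" and a two-word codebook (both made up).
codebook = [(0.0, 0.0), (1.0, 1.0)]
video_a = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.9), (0.0, 0.1)]
video_b = [(0.0, 0.0), (1.0, 1.0)]
video_c = [(1.0, 1.0), (0.9, 0.9)]

signatures = [bovw_signature(v, codebook) for v in (video_a, video_b, video_c)]
clusters = group_videos(signatures, threshold=0.9)
```

With three descriptor types, one such signature would be computed per type for each video, and the three pairwise similarities would then have to be combined in some way; how this work combines them is not stated in the abstract.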
