Adaptive weighted fusion with new spatial and temporal fingerprints for improved video copy detection

In this paper, we propose a novel modality fusion method that combines spatial and temporal fingerprint information to improve video copy detection performance. Most previously developed methods combine spatial and temporal modality information using only pre-specified weights. Hence, they may fail to adaptively adjust the significance of the temporal fingerprints, which depends on the difference between the temporal variances of the compared videos, and this leads to performance degradation in video copy detection. To overcome this limitation, the proposed method extracts two types of fingerprint information: (1) a spatial fingerprint that consists of the signs of DCT coefficients in local areas of a keyframe and (2) a temporal fingerprint that computes the temporal variances in local areas across consecutive keyframes. In addition, a so-called temporal strength measurement technique is developed to quantitatively represent the amount of temporal variance; it is used to adaptively weight the significance of the compared temporal fingerprints. Experimental results show that the proposed modality fusion method outperforms other state-of-the-art fusion methods and popular spatio-temporal fingerprints in terms of video copy detection. Furthermore, the proposed method reduces the time needed for video fingerprint matching by 39.0%, 25.1%, and 46.1% without a significant loss of detection accuracy on our synthetic dataset, the TRECVID 2009 CCD Task, and MUSCLE-VCD 2007, respectively. This result indicates that the proposed method can be readily incorporated into real-life video copy detection systems.
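The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the block size, the choice of three low-frequency DCT coefficients, the use of the mean variance as "temporal strength", the distance measures, the weighting rule in `fused_distance`, and the parameter `w_max` are all assumptions made for illustration. For brevity, the spatial fingerprint is taken from the first keyframe only.

```python
import math

def blocks(frame, size):
    """Yield non-overlapping size x size blocks of a 2-D list of intensities."""
    for top in range(0, len(frame) - size + 1, size):
        for left in range(0, len(frame[0]) - size + 1, size):
            yield [row[left:left + size] for row in frame[top:top + size]]

def dct_sign_bits(block):
    """Sign bits of three low-frequency 2-D DCT-II coefficients (DC excluded).

    Normalisation factors are omitted because only the sign is used.
    Which coefficients to keep is an assumption of this sketch.
    """
    n = len(block)
    bits = []
    for u, v in ((0, 1), (1, 0), (1, 1)):
        coeff = sum(
            block[y][x]
            * math.cos((2 * y + 1) * u * math.pi / (2 * n))
            * math.cos((2 * x + 1) * v * math.pi / (2 * n))
            for y in range(n) for x in range(n)
        )
        bits.append(1 if coeff >= 0 else 0)
    return bits

def spatial_fingerprint(frame, size=4):
    """Concatenated DCT sign bits of every local block in one keyframe."""
    return [bit for blk in blocks(frame, size) for bit in dct_sign_bits(blk)]

def temporal_fingerprint(frames, size=4):
    """Variance over consecutive keyframes of each block's mean intensity."""
    means = [[sum(map(sum, blk)) / (size * size) for blk in blocks(f, size)]
             for f in frames]
    fingerprint = []
    for series in zip(*means):  # one time series per block position
        mu = sum(series) / len(series)
        fingerprint.append(sum((m - mu) ** 2 for m in series) / len(series))
    return fingerprint

def temporal_strength(tfp):
    """Assumed strength measure: mean of the local temporal variances."""
    return sum(tfp) / len(tfp)

def hamming(a, b):
    """Normalised Hamming distance between two bit lists."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def normalised_l1(a, b, eps=1e-9):
    """L1 distance scaled by the total mass of both vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b) + eps)

def fused_distance(query, ref, w_max=0.5, size=4, eps=1e-9):
    """Fuse spatial and temporal distances with an adaptive temporal weight.

    Hypothetical rule: the temporal weight shrinks toward 0 as the temporal
    strengths of the two videos diverge, and equals w_max when they match.
    """
    s_dist = hamming(spatial_fingerprint(query[0], size),
                     spatial_fingerprint(ref[0], size))
    tq, tr = temporal_fingerprint(query, size), temporal_fingerprint(ref, size)
    sq, sr = temporal_strength(tq), temporal_strength(tr)
    w_t = w_max * (1 - abs(sq - sr) / (sq + sr + eps))
    return (1 - w_t) * s_dist + w_t * normalised_l1(tq, tr)
```

With this rule, comparing a static clip against a clip with strong temporal variation drives the temporal weight close to zero, so the decision rests almost entirely on the spatial fingerprint. A fixed pre-specified weight would instead let the mismatched temporal fingerprints contribute at full strength.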

[1]  Diane J. Cook,et al.  Automatic Video Classification: A Survey of the Literature , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[2]  Xian-Sheng Hua,et al.  Robust video signature based on ordinal measure , 2004, 2004 International Conference on Image Processing, 2004. ICIP '04..

[3]  Tao Liu,et al.  AT&T Research at TRECVID 2009 Content-based Copy Detection , 2009, TRECVID.

[4]  Li Chen,et al.  Video copy detection: a comparative study , 2007, CIVR '07.

[5]  Paul Over,et al.  TRECVID 2009 -- Goals, Tasks, Data, Evaluation Mechanisms and Metrics | NIST , 2010 .

[6]  Christian Petersohn Fraunhofer HHI at TRECVID 2004: Shot Boundary Detection System , 2004, TRECVID.

[7]  Benjamin Bustos,et al.  Competitive content-based video copy detection using global descriptors , 2011, Multimedia Tools and Applications.

[8]  Tao Liu,et al.  Effective and scalable video copy detection , 2010, MIR '10.

[9]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Shan Gao,et al.  The France Telecom Orange Labs (Beijing) Video Semantic Indexing Systems - TRECVID 2012 Notebook Paper , 2010, TRECVID.

[11]  Chang Dong Yoo,et al.  Robust video fingerprinting for content-based video identification , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[12]  Ning Chen,et al.  A robust hashing algorithm based on SURF for video copy detection , 2012, Comput. Secur..

[13]  Y.-N. Li,et al.  Fast video shot boundary detection framework employing pre-processing techniques , 2009, IET Image Process..

[14]  Tomasz Adamek,et al.  DARTs: Efficient scale-space extraction of DAISY keypoints , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Borko Furht,et al.  Video identification using video tomography , 2009, 2009 IEEE International Conference on Multimedia and Expo.

[16]  Wesley De Neve,et al.  Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement , 2012, IEEE Transactions on Circuits and Systems for Video Technology.

[17]  Nuria Oliver,et al.  Telefonica Research at TRECVID 2010 Content-Based Copy Detection , 2010, TRECVID.

[18]  Özgür Ulusoy,et al.  Video copy detection using multiple visual cues and MPEG-7 descriptors , 2010, J. Vis. Commun. Image Represent..

[19]  Keiji Yanai,et al.  Web video retrieval based on the Earth Mover’s Distance by integrating color, motion and sound , 2008, 2008 15th IEEE International Conference on Image Processing.

[20]  Chong-Wah Ngo,et al.  Flip-Invariant SIFT for Copy and Object Detection , 2013, IEEE Transactions on Image Processing.

[21]  Adrian Ulges,et al.  Detecting pornographic video content by combining image features with motion information , 2009, ACM Multimedia.

[22]  Kunio Kashino,et al.  NTT Communication Science Laboratories at TRECVID 2010 Content Based Copy Detection , 2010, TRECVID.

[23]  Olivier Buisson,et al.  Robust Content-Based Video Copy Identification in a Large Reference Database , 2003, CIVR.

[24]  Cordelia Schmid,et al.  INRIA LEAR-TEXMEX: Video Copy Detection Task , 2010, TRECVID.

[25]  Sheng Tang,et al.  A Hierarchical Scheme for Rapid Video Copy Detection , 2008, 2008 IEEE Workshop on Applications of Computer Vision.

[26]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[27]  Changick Kim,et al.  Spatiotemporal sequence matching for efficient video copy detection , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[28]  Yusuke Uchida,et al.  Fast and accurate content-based video copy detection using bag-of-global visual features , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29]  Olivier Buisson,et al.  Local Behaviours Labelling for Content Based Video Copy Detection , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[30]  Cordelia Schmid,et al.  INRIA @TRECVID 2011: Copy Detection & Multimedia Event Detection , 2011, TRECVID.

[31]  Mubarak Shah,et al.  A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.

[32]  Hung-Khoon Tan,et al.  VIREO at TRECVID 2010: Semantic Indexing, Known-Item Search, and Content-Based Copy Detection , 2010, TRECVID.

[33]  B. S. Manjunath,et al.  Efficient and Robust Detection of Duplicate Videos in a Large Database , 2010, IEEE Transactions on Circuits and Systems for Video Technology.

[34]  Nasir D. Memon,et al.  Spatio–Temporal Transform Based Video Hashing , 2006, IEEE Transactions on Multimedia.

[35]  Olivier Buisson,et al.  Scaling content-based video copy detection to very large databases , 2009, Multimedia Tools and Applications.

[36]  Yanqiang Lei,et al.  Video Sequence Matching Based on the Invariance of Color Correlation , 2012, IEEE Transactions on Circuits and Systems for Video Technology.

[37]  Yao Zhao,et al.  Frame Fusion for Video Copy Detection , 2011, IEEE Transactions on Circuits and Systems for Video Technology.

[38]  Changick Kim,et al.  Content-based image copy detection , 2003, Signal Process. Image Commun..

[39]  Luc Van Gool,et al.  Spatio-temporal features for robust content-based video copy detection , 2008, MIR '08.

[40]  Yong Man Ro,et al.  Rotation and flipping robust region binary patterns for video copy detection , 2014, J. Vis. Commun. Image Represent..

[41]  Hong Liu,et al.  A Segmentation and Graph-Based Video Sequence Matching Method for Video Copy Detection , 2013, IEEE Transactions on Knowledge and Data Engineering.

[42]  Duy-Dinh Le,et al.  National Institute of Informatics, Japan at TRECVID 2008 , 2008, TRECVID.

[43]  Rabab Kreidieh Ward,et al.  A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting , 2011, IEEE Transactions on Information Forensics and Security.

[44]  Ruud M. Bolle,et al.  Comparison of sequence matching techniques for video copy detection , 2001, IS&T/SPIE Electronic Imaging.

[45]  Yi-hua Du,et al.  Structure Information and Temporal Ordinal Measure Fused Video Copy Detection , 2011, 2011 Third International Conference on Multimedia Information Networking and Security.

[46]  Cordelia Schmid,et al.  An Image-Based Approach to Video Copy Detection With Spatio-Temporal Post-Filtering , 2010, IEEE Transactions on Multimedia.

[47]  Wesley De Neve,et al.  Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words , 2012, 2012 IEEE International Conference on Multimedia and Expo.

[48]  Parham Aarabi,et al.  Tiny Videos: A Large Data Set for Nonparametric Video Retrieval and Frame Classification , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[49]  Qi Tian,et al.  Periodicity Detection of Local Motion , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[50]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[51]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008, International Journal of Computer Vision.

[52]  Benjamin Bustos,et al.  Combining Features at Search Time: PRISMA at Video Copy Detection Task , 2011, TRECVID.

[53]  Gitto George Thampi,et al.  Content-based video copy detection using discrete wavelet transform , 2013, 2013 IEEE CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES.

[54]  Fred Stentiford,et al.  Video sequence matching based on temporal ordinal measurement , 2008, Pattern Recognit. Lett..

[55]  Chong-Wah Ngo,et al.  Practical elimination of near-duplicates from web video search , 2007, ACM Multimedia.

[56]  Jiwu Huang,et al.  Salient covariance for near-duplicate image and video detection , 2011, 2011 18th IEEE International Conference on Image Processing.