Large-scale video copy retrieval with temporal-concentration SIFT

The scale-invariant feature transform (SIFT) plays an important role in multimedia content analysis, for example in near-duplicate image and video retrieval. However, the storage and query costs of SIFT become prohibitive for large-scale databases. In this paper, SIFT features are robustly encoded with temporal information by tracking them over time to generate temporal-concentration SIFT (TCSIFT), which greatly reduces the number of local features, and with it the visual redundancy, while retaining the advantages of SIFT as far as possible. On the basis of TCSIFT, a novel framework for large-scale video copy retrieval is proposed in which the processes of retrieval and validation are implemented at the feature and frame levels. Experimental results on two datasets, CC_WEB_VIDEO and TRECVID, demonstrate that our method achieves comparable accuracy with a more compact storage size and shorter execution time, and that it adapts to various video transformations.
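
To make the idea of concentrating per-frame features along time more concrete, the following is a minimal sketch and not the paper's actual TCSIFT procedure, which the abstract does not specify. It assumes that descriptors are plain lists of floats, that tracking is a greedy nearest-neighbour match between consecutive frames, and that a track is summarised by the mean of its descriptors. The function name `concentrate_tracks` and the threshold `max_dist` are hypothetical.

```python
import math


def concentrate_tracks(frames, max_dist=0.5):
    """Collapse per-frame descriptors into one descriptor per temporal track.

    frames: list of frames; each frame is a list of descriptors (lists of floats).
    Returns a list of (start_frame, end_frame, mean_descriptor) tuples.
    Assumption: greedy matching and mean pooling stand in for the real method.
    """
    active = []    # tracks that were extended in the previous frame
    finished = []  # tracks that could not be extended any further

    for t, descriptors in enumerate(frames):
        unmatched = set(range(len(descriptors)))
        still_active = []
        for track in active:
            # Find the closest descriptor in this frame that is not yet claimed.
            best, best_dist = None, max_dist
            for i in sorted(unmatched):
                dist = math.dist(track["last"], descriptors[i])
                if dist <= best_dist:
                    best, best_dist = i, dist
            if best is None:
                finished.append(track)  # no match within max_dist: track ends
                continue
            unmatched.discard(best)
            d = descriptors[best]
            track["end"] = t
            track["last"] = d
            track["sum"] = [s + x for s, x in zip(track["sum"], d)]
            track["count"] += 1
            still_active.append(track)
        # Every descriptor left unmatched starts a new track.
        for i in sorted(unmatched):
            d = descriptors[i]
            still_active.append(
                {"start": t, "end": t, "last": d, "sum": list(d), "count": 1}
            )
        active = still_active

    finished.extend(active)
    return [
        (tr["start"], tr["end"], [s / tr["count"] for s in tr["sum"]])
        for tr in finished
    ]


if __name__ == "__main__":
    # Toy 2-D "descriptors": two features drift slightly over three frames.
    toy_frames = [
        [[0.0, 0.0], [10.0, 10.0]],
        [[0.1, 0.0], [10.0, 10.1]],
        [[0.2, 0.0]],
    ]
    for start, end, mean in concentrate_tracks(toy_frames):
        print(start, end, mean)
```

In this toy input, five per-frame descriptors collapse into two track-level descriptors, each tagged with the frame span it covers. That is the kind of reduction in feature count the abstract refers to, although the real encoding may differ substantially.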
