Real-time large scale near-duplicate web video retrieval

Near-duplicate video retrieval is becoming increasingly important with the exponential growth of the Web. Although various approaches have been proposed to address this problem, they focus mainly on retrieval accuracy and are infeasible for real-time queries against Web-scale video databases. This paper proposes a novel method to address the efficiency and scalability issues of near-duplicate Web video retrieval. We introduce a compact spatiotemporal feature to represent videos and construct an efficient data structure to index the feature, achieving real-time retrieval performance. The feature leverages the relative gray-level intensity distribution within a frame and the temporal structure of videos along the frame sequence. The new index structure is based on an inverted file and allows fast histogram intersection computation between videos. To demonstrate the effectiveness and efficiency of the proposed method, we evaluate its performance on an open Web video data set containing about 10K videos and compare it with four existing methods in terms of precision and time complexity. We also test our method on a data set containing about 50K videos and 11M key-frames. On average, a query against the whole 50K Web video data set takes 17 ms.
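
To illustrate why an inverted file makes histogram intersection fast, the following is a minimal Python sketch, not the paper's actual implementation. It assumes each video is summarized as a sparse histogram mapping a bin id to a count; the class name `InvertedHistogramIndex`, its methods, and the toy histograms are hypothetical. A query visits only the posting lists of its own non-empty bins, so videos that share no bin with the query are never touched.

```python
from collections import defaultdict


class InvertedHistogramIndex:
    """Inverted file mapping each histogram bin to the videos that populate it.

    Hypothetical illustration; the abstract does not specify this layout.
    """

    def __init__(self):
        # bin id -> list of (video id, count of that bin in the video)
        self.postings = defaultdict(list)

    def add(self, video_id, histogram):
        """Index a sparse histogram given as {bin_id: count}."""
        for bin_id, count in histogram.items():
            if count > 0:
                self.postings[bin_id].append((video_id, count))

    def query(self, histogram, top_k=5):
        """Rank indexed videos by histogram intersection with the query."""
        scores = defaultdict(float)
        for bin_id, q_count in histogram.items():
            # Only videos with a non-zero count in this bin can contribute.
            for video_id, count in self.postings.get(bin_id, ()):
                scores[video_id] += min(q_count, count)
        # Highest intersection first; ties broken by video id.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:top_k]


# Toy data (made up for illustration only).
index = InvertedHistogramIndex()
index.add("video_a", {3: 4, 7: 2, 9: 1})
index.add("video_b", {3: 1, 8: 5})
index.add("video_c", {1: 6})

results = index.query({3: 3, 7: 2, 8: 1})
print(results)
```

In this sketch the query cost grows with the total length of the posting lists for the query's bins rather than with the number of indexed videos, which is the property that makes an inverted file attractive for large collections.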
