KDDI LABS and SRI International at TRECVID 2010: Content-Based Copy Detection

We describe our system for the Content-Based Copy Detection (CBCD) task, submitted to TRECVID 2010. The system is multi-modal: it integrates the results of global visual features, local visual features, and audio features to produce the final run results. Each feature type is designed to handle different aspects of the video transformations. We submitted two runs for each of the BALANCED and NOFA profiles:

• KDDILabs-SRI.m.balanced.1
• KDDILabs-SRI.m.balanced.2
• KDDILabs-SRI.m.nofa.1
• KDDILabs-SRI.m.nofa.2

All four runs use the same CBCD framework for each of the three modalities and differ only in the parameters of the final integration step. Our CBCD framework makes significant advances in video-based copy detection: to obtain robust results against various transformations, we introduce dense-sampling-based global SIFT features, improved indexing methods for both global and local features, and handling of temporal burstiness. The TRECVID 2010 evaluation results show that our system performs well in both detection accuracy, especially on the NOFA profile, and localization accuracy.
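As a rough illustration of how runs can share one framework and differ only in the final integration step, the sketch below fuses one detection score per modality with a weighted average and applies a profile-specific threshold. The abstract does not specify the integration rule, so the weighted average, the `FusionParams` fields, and every numeric value here are assumptions chosen for illustration, not the submitted system's parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionParams:
    """Hypothetical integration parameters; all values are illustrative."""
    w_global: float   # weight of the global visual feature score
    w_local: float    # weight of the local visual feature score
    w_audio: float    # weight of the audio feature score
    threshold: float  # minimum fused score to report a copy


def fuse_scores(global_s: float, local_s: float, audio_s: float,
                p: FusionParams) -> float:
    """Weighted average of per-modality scores, each assumed to lie in [0, 1]."""
    total = p.w_global + p.w_local + p.w_audio
    weighted = (p.w_global * global_s
                + p.w_local * local_s
                + p.w_audio * audio_s)
    return weighted / total


def is_copy(global_s: float, local_s: float, audio_s: float,
            p: FusionParams) -> bool:
    """Report a copy only when the fused score reaches the profile threshold."""
    return fuse_scores(global_s, local_s, audio_s, p) >= p.threshold


# Assumed settings: the NOFA profile penalizes false alarms heavily,
# so this sketch gives it a stricter threshold than BALANCED.
BALANCED = FusionParams(w_global=1.0, w_local=1.0, w_audio=1.0, threshold=0.5)
NOFA = FusionParams(w_global=1.0, w_local=1.0, w_audio=1.0, threshold=0.8)
```

With these assumed settings, a candidate with scores (0.9, 0.6, 0.3) would be reported under `BALANCED` but rejected under `NOFA`, which shows how the same per-modality results can yield different runs.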
