Scalable mining of large video databases using copy detection

Mining the video content itself can reveal important information about the internal structure of large video databases, compensating for the persistent lack of extensive and reliable annotations. Many valuable links between video segments can be identified by content-based copy detection methods, where "copies" are transformed versions of original video sequences. To make this approach viable for large video databases, we put forward a new mining method that relies on a compact keyframe-level descriptor and a dedicated index structure. The performance obtained in detecting links between video segments is evaluated against a ground truth, and several illustrations are given. The scalability of the approach is then demonstrated for databases of up to 10,000 hours of video.
