A distance measure for repeated takes of one scene

In applications such as post-production and archiving of audiovisual material, users are confronted with large amounts of redundant, unedited raw material, called rushes. Viewing and organizing this material is crucial but time-consuming. Rushes video typically contains multiple, slightly different takes of the same scene. We propose a method for detecting and clustering takes of one scene shot from the same or very similar camera positions. An important subproblem is determining the similarity of video segments, for which we propose a distance measure based on the Longest Common Subsequence (LCSS) model. Two variants of the proposed approach, one with a threshold parameter and one with an automatically determined threshold, are compared against the Dynamic Time Warping (DTW) distance measure on six videos from the TRECVID 2007 BBC rushes summarization data set. We also evaluate how the temporal segmentation method applied to the input influences the results. Applications of the proposed method to automatic skimming and interactive browsing of rushes video are described.
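To make the idea of an LCSS-based distance with a threshold parameter concrete, the following is a minimal sketch of the generic LCSS model applied to two sequences of scalar per-frame features. It is not the measure defined in the paper: the scalar features, the matching rule `abs(x - y) <= eps`, the normalization by the shorter sequence length, and the names `lcss_length`, `lcss_distance`, and `eps` are all assumptions made for illustration.

```python
from typing import Sequence


def lcss_length(a: Sequence[float], b: Sequence[float], eps: float) -> int:
    """Length of the longest common subsequence of a and b, where two
    elements count as a match if they differ by at most eps (assumed rule)."""
    m = len(b)
    prev = [0] * (m + 1)  # DP row for the previous element of a
    for x in a:
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            if abs(x - b[j - 1]) <= eps:
                # Elements match: extend the subsequence ending before both.
                curr[j] = prev[j - 1] + 1
            else:
                # No match: skip an element of a or of b, whichever is better.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[m]


def lcss_distance(a: Sequence[float], b: Sequence[float], eps: float) -> float:
    """Distance in [0, 1]: 0 if the shorter sequence is fully matched,
    1 if nothing matches. The normalization is an illustrative choice."""
    if not a or not b:
        return 1.0
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))


if __name__ == "__main__":
    take_1 = [1, 2, 3, 4]  # hypothetical per-frame feature values
    take_2 = [1, 2, 9, 4]  # same scene with one deviating frame
    print(lcss_distance(take_1, take_2, eps=0))
```

Unlike DTW, which must align every element of both sequences, an LCSS-style measure can leave non-matching elements unaligned, so a short deviation in one take lowers the match count without distorting the rest of the alignment. The threshold `eps` here is a fixed input, corresponding loosely to the variant with a threshold parameter.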
