Automatic summarization of rushes video using bipartite graphs

In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter out sub-shots with low information content that are unlikely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing the repetitive retake shots common in rushes video. Finally, the presence of faces and the motion intensity are characterised in each sub-shot, and a measure of how representative the sub-shot is in the context of the overall video is proposed. Video summaries composed of keyframe slideshows are then generated. To evaluate the effectiveness of this approach we re-ran the evaluation carried out by TRECVid, using the same dataset and evaluation metrics as the TRECVid 2007 video summarization task but with our own assessors. Results show that our approach leads to a significant improvement over our own previous work in terms of the fraction of the TRECVid summary ground truth included, and that it is competitive with the best of the other approaches in TRECVid 2007.
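As an illustration of the second step, the following is a minimal sketch of how a bipartite matching between the keyframes of two shots could yield an inter-shot similarity score. It is not the implementation used in this work: the keyframe representation (L1-normalised histograms), the histogram-intersection measure, the `edge_threshold` value, and the normalisation by the shorter shot are all assumptions made for the example. The sketch also computes a maximum-cardinality matching (which is always maximal) with a standard augmenting-path search.

```python
from typing import List, Sequence


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two L1-normalised histograms (assumed feature)."""
    return sum(min(x, y) for x, y in zip(a, b))


def max_bipartite_matching(adj: List[List[int]], n_right: int) -> int:
    """Size of a maximum matching, found with augmenting paths (Kuhn's algorithm).

    adj[u] lists the right-side nodes that left-side node u may be matched to.
    """
    match_right = [-1] * n_right  # match_right[v] = left node matched to v, or -1

    def try_augment(u: int, seen: List[bool]) -> bool:
        for v in adj[u]:
            if seen[v]:
                continue
            seen[v] = True
            # v is free, or its current partner can be re-matched elsewhere
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(try_augment(u, [False] * n_right) for u in range(len(adj)))


def shot_similarity(
    shot_a: List[Sequence[float]],
    shot_b: List[Sequence[float]],
    edge_threshold: float = 0.8,  # hypothetical value, not taken from the paper
) -> float:
    """Fraction of the shorter shot's keyframes that find a distinct partner."""
    if not shot_a or not shot_b:
        return 0.0
    # Connect keyframe i of shot_a to keyframe j of shot_b when they look alike.
    adj = [
        [j for j, kb in enumerate(shot_b)
         if histogram_intersection(ka, kb) >= edge_threshold]
        for ka in shot_a
    ]
    matched = max_bipartite_matching(adj, len(shot_b))
    return matched / min(len(shot_a), len(shot_b))
```

Under these assumptions, two retakes of the same scene would score close to 1 and only one of them would be kept, whereas shots with no similar keyframes would score 0 and both would remain candidates for the summary.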
