Video summarization from spatio-temporal features

In this paper we present a video summarization method based on the analysis of spatio-temporal activity within the video. Visual activity is estimated by counting interest points detected jointly in the spatial and temporal domains. The proposed approach has five steps. First, image features are collected using the spatio-temporal Hessian matrix. Second, these features are processed to retrieve the candidate video segments for the summary (denoted clips). Third, redundant clips are detected. Fourth, clapperboard images are eliminated. Finally, the summary is built by retaining the clips that show the highest level of activity. The proposed approach was tested on the BBC Rushes Summarization task of the TRECVID 2008 campaign.
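The activity measure and the final selection step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a grayscale video stored as a (T, H, W) array, estimates the spatio-temporal Hessian with plain finite differences (no Gaussian smoothing or scale selection), and treats every voxel whose Hessian determinant exceeds a threshold as an interest point. The function names `hessian_activity` and `select_clips`, the `threshold` value, and the greedy frame `budget` rule are all hypothetical choices made for illustration; redundancy and clapperboard handling are omitted.

```python
import numpy as np


def hessian_activity(video, threshold=1e-3):
    """Count, per frame, the voxels whose |det(Hessian)| exceeds `threshold`.

    `video` is a (T, H, W) array. The 3x3 spatio-temporal Hessian is
    approximated with finite differences (an assumption of this sketch).
    """
    v = np.asarray(video, dtype=float)
    gt, gy, gx = np.gradient(v)            # first derivatives along t, y, x
    gtt, gty, gtx = np.gradient(gt)        # second derivatives involving t
    _, gyy, gyx = np.gradient(gy)
    _, _, gxx = np.gradient(gx)
    # Determinant of the symmetric matrix
    # [[gxx, gyx, gtx], [gyx, gyy, gty], [gtx, gty, gtt]]
    det = (gxx * (gyy * gtt - gty ** 2)
           - gyx * (gyx * gtt - gty * gtx)
           + gtx * (gyx * gty - gyy * gtx))
    return (np.abs(det) > threshold).sum(axis=(1, 2))


def select_clips(activity, clips, budget):
    """Greedily keep the clips with the highest mean activity.

    `clips` is a list of (start, end) frame ranges, end exclusive;
    `budget` is the maximum number of frames allowed in the summary.
    """
    ranked = sorted(clips, key=lambda c: activity[c[0]:c[1]].mean(),
                    reverse=True)
    chosen, used = [], 0
    for start, end in ranked:
        if used + (end - start) <= budget:
            chosen.append((start, end))
            used += end - start
    return sorted(chosen)                  # restore temporal order


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = np.zeros((12, 16, 16))
    demo[4:8] = rng.random((4, 16, 16))    # only frames 4..7 contain texture
    act = hessian_activity(demo)
    print(select_clips(act, [(0, 4), (4, 8), (8, 12)], budget=4))
```

In this toy setting the static clips score zero activity, so only the textured clip fits the frame budget. A real detector would smooth the video at several spatial and temporal scales before differentiating.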
